GLYCOSIDASE ENZYMES

This application is a continuation-in-part of pending 
patent application 08/583,787 filed January 11, 1996. 

This invention relates to newly identified 
polynucleotides, polypeptides encoded by such 
polynucleotides, the use of such polynucleotides and 
polypeptides, as well as the production and isolation of such 
polynucleotides and polypeptides. More particularly, the
polynucleotides and polypeptides of the present invention have
been putatively identified as glucosidases, α-galactosidases,
β-galactosidases, β-mannosidases, β-mannanases,
endoglucanases, and pullulanases.

The glycosidic bond of β-galactosides can be cleaved by
different classes of enzymes: (i) phospho-β-galactosidases
(EC 3.2.1.85), which are specific for a phosphorylated substrate
generated via phosphoenolpyruvate phosphotransferase system
(PTS)-dependent uptake; (ii) typical β-galactosidases (EC
3.2.1.23), represented by the Escherichia coli LacZ enzyme,
which are relatively specific for β-galactosides; and (iii)
β-glucosidases (EC 3.2.1.21) such as the enzymes of
Agrobacterium faecalis, Clostridium thermocellum, Pyrococcus
furiosus or Sulfolobus solfataricus (Day, A.G. and Withers,
S.G. (1986) Purification and characterization of a
β-glucosidase from Alcaligenes faecalis. Can. J. Biochem. Cell.
Biol. 64, 914-922; Kengen, S.W.M., et al. (1993) Eur. J.
Biochem. 213, 305-312; Ait, N., Creuzet, N. and Cattaneo, J.
(1982) Properties of β-glucosidase purified from Clostridium
thermocellum. J. Gen. Microbiol. 128, 569-577; Grogan, D.W.
(1991) Evidence that β-galactosidase of Sulfolobus
solfataricus is only one of several activities of a
thermostable β-D-glycosidase. Appl. Environ. Microbiol. 57,
1644-1649). Members of the latter group, although highly
specific with respect to the β-anomeric configuration of the
glycosidic linkage, often display a rather relaxed substrate
specificity and hydrolyse β-glucosides as well as β-fucosides
and β-galactosides.

Generally, α-galactosidases are enzymes that catalyze
the hydrolysis of galactose groups on a polysaccharide
backbone or the cleavage of di- or oligosaccharides
comprising galactose.

Generally, β-mannanases are enzymes that catalyze the
hydrolysis of mannose groups internally on a polysaccharide
backbone or the cleavage of di- or oligosaccharides
comprising mannose groups. β-mannosidases hydrolyze
non-reducing, terminal mannose residues on a
mannose-containing polysaccharide and catalyze the cleavage of
di- or oligosaccharides comprising mannose groups.

Guar gum is a branched galactomannan polysaccharide
composed of a β-1,4 linked mannose backbone with α-1,6 linked
galactose sidechains. The enzymes required for the
degradation of guar are β-mannanase, β-mannosidase and
α-galactosidase. β-mannanase hydrolyses the mannose backbone
internally and β-mannosidase hydrolyses non-reducing,
terminal mannose residues. α-galactosidase hydrolyses
α-linked galactose groups.

Galactomannan polysaccharides and the enzymes that
degrade them have a variety of applications. Guar is
commonly used as a thickening agent in food and is utilized
in hydraulic fracturing in oil and gas recovery.
Consequently, galactomannanases are industrially relevant for
the degradation and modification of guar. Furthermore, a
need exists for thermostable galactomannanases that are active
in extreme conditions associated with drilling and well
stimulation.

There are other applications for these enzymes in
various industries, such as in the beet sugar industry.
20-30% of the domestic U.S. sucrose consumption is sucrose from
sugar beets. Raw beet sugar can contain a small amount of
raffinose when the sugar beets are stored before processing
and rotting begins to set in. Raffinose inhibits the
crystallization of sucrose and also constitutes a hidden
quantity of sucrose. Thus, there is merit to eliminating
raffinose from raw beet sugar. α-Galactosidase has also been
used as a digestive aid to break down raffinose, stachyose,
and verbascose in such foods as beans and other gassy foods.

β-Galactosidases which are active and stable at high
temperatures appear to be superior enzymes for the production
of lactose-free dietary milk products (Chaplin, M.F. and
Bucke, C. (1990) In: Enzyme Technology, pp. 159-160,
Cambridge University Press, Cambridge, UK). Also, several
studies have demonstrated the applicability of
β-galactosidases to the enzymatic synthesis of oligosaccharides
via transglycosylation reactions (Nilsson, K.G.I. (1988)
Enzymatic synthesis of oligosaccharides. Trends Biotechnol.
6, 256-264; Cote, G.L. and Tao, B.Y. (1990) Oligosaccharide
synthesis by enzymatic transglycosylation. Glycoconjugate J.
7, 145-162). Despite the commercial potential, only a few
β-galactosidases of thermophiles have been characterized so
far. Two reported genes encode β-galactoside-cleaving enzymes
of the hyperthermophilic bacterium Thermotoga maritima, one
of the most thermophilic organotrophic eubacteria described
to date (Huber, R., Langworthy, T.A., Konig, H., Thomm, M.,
Woese, C.R., Sleytr, U.B. and Stetter, K.O. (1986) T. maritima
sp. nov. represents a new genus of unique extremely
thermophilic eubacteria growing up to 90°C, Arch. Microbiol.
144, 324-333). The gene products have been
identified as a β-galactosidase and a β-glucosidase.

Pullulanase is well known as a debranching enzyme of
pullulan and starch. The enzyme hydrolyzes α-1,6-glucosidic
linkages on these polymers. Starch degradation for the
production of sweeteners (glucose or maltose) is a very
important industrial application of this enzyme. The
degradation of starch is carried out in two stages. The first
stage involves the liquefaction of the substrate with
α-amylase, and the second stage, or saccharification stage, is
performed by β-amylase with pullulanase added as a
debranching enzyme, to obtain better yields.

Endoglucanases can be used in a variety of industrial
applications. For instance, the endoglucanases of the
present invention can hydrolyze the internal β-1,4-glycosidic
bonds in cellulose, which may be used for the conversion of
plant biomass into fuels and chemicals. Endoglucanases also
have applications in detergent formulations, the textile
industry, in animal feed, in waste treatment, and in the
fruit juice and brewing industry for the clarification and
extraction of juices.

The polynucleotides and polypeptides of the present
invention have been identified as glucosidases,
α-galactosidases, β-galactosidases, β-mannosidases,
β-mannanases, endoglucanases, and pullulanases as a result of
their enzymatic activity.

In accordance with one aspect of the present invention, 
there are provided novel enzymes, as well as active 
fragments , analogs and derivatives thereof . 

In accordance with another aspect of the present
invention, there are provided isolated nucleic acid molecules
encoding the enzymes of the present invention including
mRNAs, cDNAs, genomic DNAs as well as active analogs and
fragments of such enzymes.

In accordance with another aspect of the present
invention, there are provided isolated nucleic acid molecules
encoding mature polypeptides expressed by the DNA contained
in ATCC Deposit No. 97379.

In accordance with yet a further aspect of the present
invention, there is provided a process for producing such
polypeptides by recombinant techniques comprising culturing
recombinant prokaryotic and/or eukaryotic host cells
containing a nucleic acid sequence of the present invention,
under conditions promoting expression of said enzymes and
subsequent recovery of said enzymes.

In accordance with yet a further aspect of the present
invention, there is provided a process for utilizing such
enzymes, or polynucleotides encoding such enzymes, for
hydrolyzing lactose to galactose and glucose for use in the
food processing industry, the pharmaceutical industry (for
example, to treat intolerance to lactose), as a diagnostic
reporter molecule, in corn wet milling, in the fruit juice
industry, in baking, in the textile industry and in the
detergent industry.

In accordance with yet a further aspect of the present
invention, there is provided a process for utilizing such
enzymes for hydrolyzing guar gum (a galactomannan
polysaccharide) to remove non-reducing terminal mannose
residues. Further, polysaccharides such as galactomannan and
the enzymes according to the invention that degrade them have
a variety of applications. Guar gum is commonly used as a
thickening agent in food and also is utilized in hydraulic
fracturing in oil and gas recovery. Consequently, mannanases
are industrially relevant for the degradation and
modification of guar gums. Furthermore, a need exists for
thermostable mannanases that are active in extreme conditions
associated with drilling and well stimulation.

In accordance with yet a further aspect of the present
invention, there are also provided nucleic acid probes that
specifically hybridize to a nucleic acid sequence of the
present invention.

In accordance with yet a further aspect of the present
invention, there is provided a process for utilizing such
enzymes, or polynucleotides encoding such enzymes, for in
vitro purposes related to scientific research, for example,
to generate probes for identifying similar sequences which
might encode similar enzymes from other organisms by using
certain regions, i.e., conserved sequence regions, of the
nucleotide sequence.

These and other aspects of the present invention should 
be apparent to those skilled in the art from the teachings 
herein . 

Brief Description of the Drawings 

The following drawings are illustrative of embodiments
of the invention and are not meant to limit the scope of the
invention as encompassed by the claims.

Figure 1 is an illustration of the full-length DNA and 
corresponding deduced amino acid sequence of M11TL of the 
present invention. Sequencing was performed using a 378 
automated DNA sequencer for all sequences of the present 
invention (Applied Biosystems , Inc.) . 

Figure 2 is an illustration of the full-length DNA and 
corresponding deduced amino acid sequence of OC1/4V-33B/G . 

Figure 3 is an illustration of the full-length DNA and 
corresponding deduced amino acid sequence of F1-12G. 

Figure 4 is an illustration of the full-length DNA and
corresponding deduced amino acid sequence of 9N2-31B/G.

Figure 5 is an illustration of the full-length DNA and
corresponding deduced amino acid sequence of MSB8-6G.

Figure 6 is an illustration of the full-length DNA and
corresponding deduced amino acid sequence of AEDII12RA-18B/G.

Figure 7 is an illustration of the full-length DNA and
corresponding deduced amino acid sequence of GC74-22G.

Figure 8 is an illustration of the full-length DNA and
corresponding deduced amino acid sequence of VC1-7G1.

Figure 9 is an illustration of the full-length DNA and 
corresponding deduced amino acid sequence of 37GP1. 

Figure 10 is an illustration of the full-length DNA and 
corresponding deduced amino acid sequence of 6GC2 . 

Figure 11 is an illustration of the full-length DNA and 
corresponding deduced amino acid sequence of 6GP2 . 

Figure 12 is an illustration of the full-length DNA and 
corresponding deduced amino acid sequence of 63GB1. 

Figure 13 is an illustration of the full-length DNA and
corresponding deduced amino acid sequence of OC1/4V.

Figure 14 is an illustration of the full-length DNA and 
corresponding deduced amino acid sequence of 6GP3 . 

Definitions 

The term "gene" means the segment of DNA involved in 
producing a polypeptide chain; it includes regions preceding 
and following the coding region (leader and trailer) as well 
as intervening sequences (introns) between individual coding 
segments (exons) . 

A coding sequence is "operably linked to" another coding 
sequence when RNA polymerase will transcribe the two coding 
sequences into a single mRNA, which is then translated into 
a single polypeptide having amino acids derived from both 
coding sequences . The coding sequences need not be 
contiguous to one another so long as the expressed sequences 
ultimately process to produce the desired protein. 

"Recombinant" enzymes refer to enzymes produced by 
recombinant DNA techniques; i.e., produced from cells 
transformed by an exogenous DNA construct encoding the 
desired enzyme. "Synthetic" enzymes are those prepared by 
chemical synthesis . 

A DNA "coding sequence of" or a "nucleotide sequence
encoding" a particular enzyme is a DNA sequence which is
transcribed and translated into an enzyme when placed under
the control of appropriate regulatory sequences.

Summary of the Invention

In accordance with an aspect of the present invention,
there are provided isolated nucleic acids (polynucleotides)
which encode for the mature enzymes having the deduced amino
acid sequences of Figures 1-14 (SEQ ID NOS:15-28).

In accordance with another aspect of the present
invention, there are provided isolated polynucleotides
encoding the enzymes of the present invention. The deposited
material is a mixture of genomic clones comprising DNA
encoding an enzyme of the present invention. Each genomic
clone comprising the respective DNA has been inserted into a
pBluescript vector (Stratagene, La Jolla, CA). The deposit
has been deposited with the American Type Culture Collection,
12301 Parklawn Drive, Rockville, Maryland 20852, USA, on
December 13, 1995 and assigned ATCC Deposit No. 97379.

The deposit(s) have been made under the terms of the
Budapest Treaty on the International Recognition of the
Deposit of Micro-organisms for Purposes of Patent Procedure.
The strains will be irrevocably and without restriction or
condition released to the public upon the issuance of a
patent. These deposits are provided merely as a convenience to
those of skill in the art and are not an admission that a
deposit is required under 35 U.S.C. §112. The sequences of
the polynucleotides contained in the deposited materials, as
well as the amino acid sequences of the polypeptides encoded
thereby, are controlling in the event of any conflict with
any description of sequences herein. A license may be
required to make, use or sell the deposited materials, and no
such license is hereby granted.

Detailed Description of the Invention 

The polynucleotides of this invention were originally
recovered from genomic gene libraries derived from the
following organisms:

M11TL is a new species of Desulfurococcus isolated from
Diamond Pool in Yellowstone National Park. The organism
grows optimally at 85-88°C, pH 7.0 in a low salt medium
containing yeast extract, peptone, and gelatin as substrates
with a N2/CO2 gas phase.

OC1/4V is from the genus Thermotoga. The organism was
isolated from Yellowstone National Park. It grows optimally
at 75°C in a low salt medium with cellulose as a substrate
and N2 in gas phase.

Pyrococcus furiosus VC1 is from the genus Pyrococcus.
VC1 was isolated from Vulcano, Italy. It grows optimally at
100°C in a high salt medium (marine) containing elemental
sulfur, yeast extract, peptone and starch as substrates and
N2 in gas phase.

Staphylothermus marinus F1 is from the genus
Staphylothermus. F1 was isolated from Vulcano, Italy. It
grows optimally at 85°C, pH 6.5 in high salt medium (marine)
containing elemental sulfur and yeast extract as substrates
and N2 in gas phase.

Thermococcus 9N-2 is from the genus Thermococcus. 9N-2
was isolated from diffuse vent fluid in the East Pacific
Rise. It is a strict anaerobe that grows optimally at 87°C.

Thermotoga maritima MSB8 is from the genus Thermotoga,
and was isolated from Vulcano, Italy. MSB8 grows optimally
at 85°C, pH 6.5 in a high salt medium (marine) containing
starch and yeast extract as substrates and N2 in gas phase.

Thermococcus alcaliphilus AEDII12RA is from the genus
Thermococcus. AEDII12RA grows optimally at 85°C, pH 9.5 in
a high salt medium (marine) containing polysulfides and yeast
extract as substrates and N2 in gas phase.

Thermococcus chitonophagus GC74 is from the genus
Thermococcus. GC74 grows optimally at 85°C, pH 6.0 in a high
salt medium (marine) containing chitin, meat extract,
elemental sulfur and yeast extract as substrates and N2 in gas
phase. AEPII 1a grows optimally at 85°C at pH 6.5 in marine
medium under anaerobic conditions. It has many substrates.
[Add descriptions of new organisms] 

Accordingly, the polynucleotides and enzymes encoded
thereby are identified by the organism from which they were
isolated, and are sometimes hereinafter referred to as
"M11TL" (Figure 1 and SEQ ID NOS:1 and 15), "OC1/4V-33B/G"
(Figure 2 and SEQ ID NOS:2 and 16), "F1-12G" (Figure 3 and
SEQ ID NOS:3 and 17), "9N2-31B/G" (Figure 4 and SEQ ID NOS:4
and 18), "MSB8" (Figure 5 and SEQ ID NOS:5 and 19),
"AEDII12RA-18B/G" (Figure 6 and SEQ ID NOS:6 and 20),
"GC74-22G" (Figure 7 and SEQ ID NOS:7 and 21), "VC1-7G1" (Figure 8
and SEQ ID NOS:8 and 22), "37GP1" (Figure 9 and SEQ ID NOS:9
and 23), "6GC2" (Figure 10 and SEQ ID NOS:10 and 24),
"6GP2" (Figure 11 and SEQ ID NOS:11 and 25), "AEPII 1a"
(Figure 12 and SEQ ID NOS:12 and 26), "OC1/4V" (Figure 13 and
SEQ ID NOS:13 and 27), and "6GP3" (Figure 14 and SEQ ID
NOS:28).

The polynucleotides and polypeptides of the present 
invention show identity at the nucleotide and protein level 
to known genes and proteins encoded thereby as shown in Table 
1 . 



Table 1

Clone                                       Gene/Protein with Closest Homology                        Protein    Nucleic Acid
                                                                                                      Identity   Identity

M11TL-29G                                   Sulfolobus solfataricus DSM 1616/P1, β-galactosidase      51%        55%
OC1/4V-33B/G                                Caldocellum saccharolyticum, β-glucosidase                52%        57%
Staphylothermus marinus F1-12G              Bacillus polymyxa, β-galactosidase                        36%        48%
Thermococcus 9N2-31B/G                      Sulfolobus solfataricus ATCC 49255/MT4, β-galactosidase   51%        50%
Thermotoga maritima MSB8-6G                 Clostridium thermocellum bglB                             45%        53%
Thermococcus alcaliphilus AEDII12RA-18B/G   Bacillus polymyxa, β-galactosidase                        34%        48%
Thermococcus chitonophagus GC74-22G         Sulfolobus solfataricus ATCC 49255/MT4, β-galactosidase   46%        54%
Pyrococcus furiosus VC1-7G1                 Sulfolobus β-galactosidase                                46.4%      52.5%
Thermotoga maritima α-galactosidase (6GC2)  Pediococcus pentosaceus α-galactosidase                   49%        29%
Thermotoga maritima mannanase (6GP2)        Aspergillus aculeatus mannanase                           56%        37%
AEPII 1a β-mannosidase (63GB1)              Sulfolobus solfataricus β-galactosidase                   78%        56%
OC1/4V endoglucanase                        Clostridium thermocellum endo-1,4-β-endoglucanase         65%        43%
Thermotoga maritima pullulanase (6GP3)      Caldocellum saccharolyticum α-dextrin 6-glucanohydrolase  72%        53%
Bankia gouldi mix endoglucanase (37GP1)     None available







The polynucleotides and enzymes of the present invention
show homology to each other as shown in Table 2.

Table 2

Clone                             Gene/Protein with Closest Homology                          Protein    Nucleic Acid
                                                                                              Identity   Identity

Staphylothermus marinus F1-12G    Thermococcus AEDII12RA-18B/G, β-galactosidase, glucosidase  55%        57%
Thermococcus 9N2-31B/G            Thermococcus chitonophagus GC74-22G glucosidase             74%        66%
Pyrococcus furiosus VC1-7G1       Pyrococcus furiosus VC1-7B/G β-galactosidase                46.4%      54%


All the clones identified in Tables 1 and 2 encode
polypeptides which have α-glycosidase or β-glycosidase
activity.

This invention, in addition to the isolated nucleic acid
molecules encoding the enzymes of the present invention, also
provides substantially similar sequences. Isolated nucleic
acid sequences are substantially similar if: (i) they are
capable of hybridizing under conditions hereinafter
described, to the polynucleotides of SEQ ID NOS:1-8; or (ii)
they are DNA sequences which are degenerate to the
polynucleotides of SEQ ID NOS:1-8. Degenerate DNA sequences
encode the amino acid sequences of SEQ ID NOS:9-16, but have
variations in the nucleotide coding sequences. As used
herein, substantially similar refers to the sequences having
similar identity to the sequences of the instant invention.
The nucleotide sequences that are substantially the same can
be identified by hybridization or by sequence comparison.
Enzyme sequences that are substantially the same can be
identified by one or more of the following: proteolytic
digestion, gel electrophoresis and/or microsequencing.






WO 97/25417 PCT/US97/00092

One means for isolating the nucleic acid molecules
encoding the enzymes of the present invention is to probe a
gene library with a natural or artificially designed probe
using art recognized procedures (see, for example: Current
Protocols in Molecular Biology, Ausubel F.M. et al. (Eds.),
Greene Publishing Assoc. and John Wiley Interscience,
New York, 1989, 1992). It is appreciated by one skilled in
the art that the polynucleotides of SEQ ID NOS:1-14 or
fragments thereof (comprising at least 12 contiguous
nucleotides) are particularly useful probes. Other
particularly useful probes for this purpose are fragments
hybridizable to the sequences of SEQ ID NOS:1-14 (i.e.,
comprising at least 12 contiguous nucleotides).
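The minimum probe length stated above can be illustrated with a short sketch. It is only an illustration under stated assumptions: the function names and the 17-base example sequence are hypothetical and are not taken from the sequence listing, and the sketch simply enumerates every contiguous 12-nucleotide window of a sequence together with its reverse complement.

```python
def candidate_probes(sequence, min_length=12):
    """Return every contiguous window of min_length bases from sequence."""
    seq = sequence.upper()
    if len(seq) < min_length:
        return []  # too short to yield a probe of the required length
    return [seq[i:i + min_length] for i in range(len(seq) - min_length + 1)]


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))


# Hypothetical 17-base sequence used only for illustration.
example = "ATGGCTAGCTTAGGCTA"
for probe in candidate_probes(example):
    print(probe, reverse_complement(probe))
```

A sequence of length n yields n - 11 such windows, so the 17-base example gives six candidate 12-mers.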

With respect to nucleic acid sequences which hybridize
to specific nucleic acid sequences disclosed herein,
hybridization may be carried out under conditions of reduced
stringency, medium stringency or even stringent conditions.
As an example of oligonucleotide hybridization, a polymer
membrane containing immobilized denatured nucleic acids is
first prehybridized for 30 minutes at 45°C in a solution
consisting of 0.9 M NaCl, 50 mM NaH2PO4, pH 7.0, 5.0 mM
Na2EDTA, 0.5% SDS, 10X Denhardt's, and 0.5 mg/mL
polyriboadenylic acid. Approximately 2 X 10⁷ cpm (specific
activity 4-9 X 10⁸ cpm/µg) of 32P end-labeled oligonucleotide
probe are then added to the solution. After 12-16 hours of
incubation, the membrane is washed for 30 minutes at room
temperature in 1X SET (150 mM NaCl, 20 mM Tris hydrochloride,
pH 7.8, 1 mM Na2EDTA) containing 0.5% SDS, followed by a 30
minute wash in fresh 1X SET at Tm-10°C for the
oligonucleotide probe. The membrane is then exposed to
autoradiographic film for detection of hybridization signals.
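The final wash is set 10°C below the melting temperature (Tm) of the oligonucleotide probe, but the text does not say how Tm is estimated. The sketch below assumes the Wallace rule, Tm = 2(A+T) + 4(G+C), a common rule of thumb for short oligonucleotides; the rule, the function names and the example probe are assumptions for illustration and are not part of the described procedure.

```python
def wallace_tm(oligo):
    """Estimate Tm in deg C with the Wallace rule (an assumed estimator)."""
    seq = oligo.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def stringent_wash_temp(oligo, offset=10):
    """Return the wash temperature, taken as Tm minus the offset (deg C)."""
    return wallace_tm(oligo) - offset


# Hypothetical 17-base probe (9 A/T and 8 G/C bases).
probe = "ATGGCTAGCTTAGGCTA"
print(wallace_tm(probe), stringent_wash_temp(probe))
```

Other Tm estimators (for example, nearest-neighbor methods) would give different values; only the "Tm minus 10°C" relationship comes from the text.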

Stringent conditions means hybridization will occur only
if there is at least 90% identity, preferably at least 95%
identity and most preferably at least 97% identity between
the sequences. See J. Sambrook et al., Molecular Cloning,
A Laboratory Manual, 2d Ed., Cold Spring Harbor Laboratory
(1989), which is hereby incorporated by reference in its
entirety. Also, it is understood that a fragment of a 100 bps
sequence that is 95 bps in length has 95% identity with the
100 bps sequence from which it is obtained.

As used herein, a first DNA (RNA) sequence is at least
70% and preferably at least 80% identical to another DNA
(RNA) sequence if there is at least 70% and preferably at
least 80% or 90% identity, respectively, between the bases
of the first sequence and the bases of the other sequence,
when properly aligned with each other, for example when
aligned by BLASTN.

"Identity" as the term is used herein, refers to a
polynucleotide sequence which comprises a percentage of the
same bases as a reference polynucleotide (SEQ ID NOS:1-8).
For example, a polynucleotide which is at least 90% identical
to a reference polynucleotide has polynucleotide bases which
are identical in 90% of the bases which make up the reference
polynucleotide and may have different bases in 10% of the
bases which comprise that polynucleotide sequence.
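The definition of identity above can be sketched as a small calculation. The sketch assumes the two sequences have already been aligned (for example, by BLASTN) into equal-length strings in which "-" marks a gap; the function name and the example sequences are hypothetical. Identity is expressed relative to the number of bases in the reference, as in the definition.

```python
def percent_identity(reference, candidate):
    """Percent of reference bases matched by candidate in a given alignment.

    Both arguments are aligned strings of equal length; "-" denotes a gap.
    """
    if len(reference) != len(candidate):
        raise ValueError("aligned sequences must have the same length")
    reference_bases = sum(1 for base in reference if base != "-")
    if reference_bases == 0:
        raise ValueError("reference contains no bases")
    matches = sum(
        1 for ref, cand in zip(reference, candidate) if ref == cand and ref != "-"
    )
    return 100.0 * matches / reference_bases


# One mismatch in ten bases gives 90% identity.
print(percent_identity("ACGTACGTAC", "ACGTACGTAT"))

# A 95-base fragment of a 100-base reference gives 95% identity.
reference = "ACGT" * 25
fragment = reference[:95] + "-" * 5
print(percent_identity(reference, fragment))
```

The second case mirrors the statement that a 95 bps fragment of a 100 bps sequence has 95% identity with the sequence from which it is obtained.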

The present invention relates to polynucleotides which
differ from the reference polynucleotide such that the
changes are silent changes, for example the changes do not
alter the amino acid sequence encoded by the polynucleotide.
The present invention also relates to nucleotide changes
which result in amino acid substitutions, additions,
deletions, fusions and truncations in the polypeptide encoded
by the reference polynucleotide. In a preferred aspect of
the invention these polypeptides retain the same biological
action as the polypeptide encoded by the reference
polynucleotide.
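A silent change can be illustrated by translating a reference and a variant coding sequence and comparing the results. The sketch below is an illustration only: it assumes the standard genetic code, and the function names and the three short example sequences are hypothetical rather than taken from the sequence listing.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G ("*" = stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a coding sequence whose length is a multiple of three."""
    seq = dna.upper()
    if len(seq) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


def is_silent_change(reference, variant):
    """True if variant differs from reference but encodes the same protein."""
    return reference.upper() != variant.upper() and translate(reference) == translate(variant)


reference = "ATGGCTGAA"   # Met-Ala-Glu
silent = "ATGGCCGAG"      # GCT->GCC and GAA->GAG, still Met-Ala-Glu
nonsilent = "ATGGATGAA"   # GCT->GAT changes Ala to Asp
print(is_silent_change(reference, silent), is_silent_change(reference, nonsilent))
```

The same comparison applies to degenerate coding sequences in general: any variant for which the translation is unchanged encodes the same mature enzyme.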

It is also appreciated that such probes can be and are
preferably labeled with an analytically detectable reagent to
facilitate identification of the probe. Useful reagents
include but are not limited to radioactivity, fluorescent
dyes or enzymes capable of catalyzing the formation of a
detectable product. The probes are thus useful to isolate
complementary copies of DNA from other sources or to screen
such sources for related sequences.

The polynucleotides of this invention were recovered
from genomic gene libraries from the organisms listed in
Table 1. For example, gene libraries can be generated in the
Lambda ZAP II cloning vector (Stratagene Cloning Systems).
Mass excisions can be performed on these libraries to
generate libraries in the pBluescript phagemid. Libraries
are thus generated and excisions performed according to the
protocols/methods hereinafter described.

The excision libraries are introduced into the E. coli
strain BW14893 F'kan1A. Expression clones are then
identified using a high temperature filter assay. Expression
clones encoding several glucanases and several other
glycosidases are identified and repurified. The
polynucleotides, and enzymes encoded thereby, of the present
invention, yield the activities as described above.

The coding sequences for the enzymes of the present
invention were identified by screening the genomic DNAs
prepared for the clones having glucosidase or galactosidase
activity.

An example of such an assay is a high temperature filter
assay wherein expression clones were identified using
Z-buffer (see recipe below) containing 1 mg/ml of the
substrate 5-bromo-4-chloro-3-indolyl-β-D-glucopyranoside
(XGLU) (Diagnostic Chemicals Limited or Sigma) after
introducing an excision library into the E. coli strain
BW14893 F'kan1A. Expression clones encoding XGLUases were
identified and repurified from M11TL, OC1/4V, Pyrococcus
furiosus VC1, Staphylothermus marinus F1, Thermococcus 9N-2,
Thermotoga maritima MSB8, Thermococcus alcaliphilus
AEDII12RA, and Thermococcus chitonophagus GC74.

Z-buffer: (referenced in Miller, J.H. (1992) A Short
Course in Bacterial Genetics, p. 445)

per liter:

Na2HPO4·7H2O          16.1 g
NaH2PO4·H2O           5.5 g
KCl                   0.75 g
MgSO4·7H2O            0.246 g
β-mercaptoethanol     2.7 ml

Adjust pH to 7.0
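As a cross-check on the per-liter recipe, the weighed amounts can be converted to approximate molar concentrations and scaled to other volumes. This is a sketch under stated assumptions: the molar masses are standard values supplied here rather than taken from the text, the monobasic phosphate is assumed to be the monohydrate (NaH2PO4·H2O) as in Miller's published recipe, and the dictionary and function names are hypothetical.

```python
# grams per liter from the recipe, paired with assumed molar masses (g/mol)
Z_BUFFER_SALTS = {
    "Na2HPO4.7H2O": (16.1, 268.07),
    "NaH2PO4.H2O": (5.5, 137.99),   # monohydrate assumed
    "KCl": (0.75, 74.55),
    "MgSO4.7H2O": (0.246, 246.47),
}


def millimolar(grams_per_liter, molar_mass):
    """Convert a g/L amount to a concentration in mM."""
    return 1000.0 * grams_per_liter / molar_mass


def grams_for_volume(component, liters):
    """Scale the per-liter amount of one salt to another batch volume."""
    grams_per_liter, _ = Z_BUFFER_SALTS[component]
    return grams_per_liter * liters


for name, (grams, molar_mass) in Z_BUFFER_SALTS.items():
    print(f"{name}: {millimolar(grams, molar_mass):.2f} mM")
```

Under these assumptions the recipe works out to roughly 60 mM Na2HPO4, 40 mM NaH2PO4, 10 mM KCl and 1 mM MgSO4.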

High Temperature Filter Assay 

(1) The F factor F'kan (from E. coli strain CSH118)(1) was
introduced into the pho- phn- lac- strain BW14893(2).
The filamentous phage library was plated on
the resulting strain, BW14893 F'kan. ((1) Miller, J.H.
(1992) A Short Course in Bacterial Genetics; (2) Lee, K.S.,
Metcalf, et al. (1992) Evidence for two phosphonate
degradative pathways in Enterobacter aerogenes, J.
Bacteriol. 174:2501-2510.)

(2) After growth on 100 mm LB plates containing 100 µg/ml
ampicillin, 80 µg/ml methicillin and 1 mM IPTG, colony
lifts were performed using Millipore HATF membrane
filters.

(3) The colonies transferred to the filters were lysed with 
chloroform vapor in 150 mm glass petri dishes . 

(4) The filters were transferred to 100 mm glass petri 
dishes containing a piece of Whatman 3 MM filter paper 
saturated with buffer. 

(a) When testing for galactosidase activity
(XGALase), 3 MM paper was saturated with Z buffer
containing 1 mg/ml XGAL (ChemBridge Corporation).
After transferring the filter bearing lysed colonies to
the glass petri dish, the dish was placed in an oven at
80-85°C.

(b) When testing for glucosidase (XGLUase), 3 MM
paper was saturated with Z buffer containing 1
mg/ml XGLU. After transferring the filter bearing
lysed colonies to the glass petri dish, the dish was
placed in an oven at 80-85°C.
(5) 'Positives' were observed as blue spots on the filter
membranes. The following filter rescue technique was used
to retrieve plasmid from a lysed positive colony. A
Pasteur pipette (or glass capillary tube) was used to core
blue spots on the filter membrane. The small filter
disk was placed in an Eppendorf tube containing 20 µl water.
The Eppendorf tube was incubated at 75°C for 5 minutes
followed by vortexing to elute plasmid DNA off the filter.
This DNA was transformed into electrocompetent E. coli
DH10B cells for Thermotoga maritima MSB8-6G,
Staphylothermus marinus F1-12G, Thermococcus
AEDII12RA-18B/G, Thermococcus chitonophagus GC74-22G, M11TL
and OC1/4V. Electrocompetent BW14893 F'kan1A E. coli were
used for Thermococcus 9N2-31B/G and Pyrococcus furiosus
VC1-7G1. The filter-lift assay was repeated on
transformation plates to identify 'positives'. The
transformation plates were returned to a 37°C incubator
after the filter lift to regenerate colonies. 3 ml of LB
liquid containing 100 µg/ml ampicillin was inoculated with
repurified positives and incubated at 37°C overnight.
Plasmid DNA was isolated from these cultures and the
plasmid insert was sequenced. In some instances where
the plates used for the initial colony lifts contained
non-confluent colonies, a specific colony corresponding
to a blue spot on the filter could be identified on a
regenerated plate and repurified directly, instead of
using the filter rescue technique.

Another example of such an assay is a variation of the
high temperature filter assay wherein colony-laden filters
are heat-killed at different temperatures (for example, 105°C
for 20 minutes) to monitor thermostability. The 3 MM paper is
saturated with different buffers (e.g., 100 mM NaCl, 5 mM
MgCl2, 100 mM Tris-Cl (pH 9.5)) to determine enzyme activity
under different buffer conditions.

A β-glucosidase assay may also be employed, wherein
GlcpβNp is used as an artificial substrate
(aryl-β-glucosidase). The increase in absorbance at 405 nm as a
result of p-nitrophenol (pNp) liberation is followed on a
Hitachi U-1100 spectrophotometer, equipped with a
thermostatted cuvette holder. The assays may be performed at
80°C or 90°C in a closed 1-ml quartz cuvette. A standard
reaction mixture contains 150 mM trisodium substrate, pH 5.0
(at 80°C), and 0.95 mM pNp derivative (ε pNp = 0.561 mM⁻¹·cm⁻¹).
The reaction mixture is allowed to reach the desired
temperature, after which the reaction is started by injecting
an appropriate amount of enzyme (1.06 ml final volume).

One unit (U) of β-glucosidase activity is defined as that amount
required to catalyze the formation of 1.0 µmol pNp/min.
D-cellobiose may also be used as a substrate.
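
The unit definition above can be turned into a short calculation. The following is a minimal sketch, not part of the described protocol: it converts a measured absorbance slope at 405 nm into units using the extinction coefficient and final volume quoted above. The function name and the 1 cm path length are assumptions made for illustration.

```python
# Sketch: convert an absorbance slope at 405 nm into enzyme units (U).
EPSILON_PNP = 0.561     # mM^-1 cm^-1, extinction coefficient quoted in the text
FINAL_VOLUME_ML = 1.06  # ml, final reaction volume quoted in the text


def units_from_slope(delta_a405_per_min, path_length_cm=1.0,
                     epsilon=EPSILON_PNP, volume_ml=FINAL_VOLUME_ML):
    """Return activity in U (micromol pNp liberated per minute).

    path_length_cm=1.0 is an assumption; the text does not state it.
    """
    if epsilon <= 0 or path_length_cm <= 0:
        raise ValueError("epsilon and path length must be positive")
    # Beer-Lambert law: concentration change (mM/min) = (dA/min) / (epsilon * l)
    rate_mm_per_min = delta_a405_per_min / (epsilon * path_length_cm)
    # 1 mM equals 1 micromol per ml, so multiplying by ml gives micromol/min
    return rate_mm_per_min * volume_ml


if __name__ == "__main__":
    # Hypothetical reading: absorbance rises by 0.2 per minute
    print(round(units_from_slope(0.2), 4))
```

Dividing the result by the amount of protein added would give a specific activity; that step is omitted because the text does not describe protein quantitation here.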

An ONPG assay for β-galactosidase activity is described
by Miller, J.H. (1992) A Short Course in Bacterial Genetics
and Miller, J.H. (1992) Experiments in Molecular Genetics, the
contents of which are hereby incorporated by reference in
their entirety.

A quantitative fluorometric assay for β-galactosidase
specific activity is described by Youngman P., (1987)
Plasmid Vectors for Recovering and Exploiting Tn917
Transpositions in Bacillus and other Gram-Positive Bacteria.
In Plasmids: A Practical Approach (ed. K. Hardy) pp 79-103.
IRL Press, Oxford. A description of the procedure can be
found in Miller (1992) p. 75-77, the contents of which are
incorporated by reference herein in their entirety.

The polynucleotides of the present invention may be in
the form of DNA which DNA includes cDNA, genomic DNA, and
synthetic DNA. The DNA may be double-stranded or single-
stranded, and if single stranded may be the coding strand or
non-coding (anti-sense) strand. The coding sequence which
encodes the mature enzymes may be identical to the coding
sequences shown in Figures 1-8 (SEQ ID NOS: 1-8) or may be a
different coding sequence which coding sequence, as a result
of the redundancy or degeneracy of the genetic code, encodes
the same mature enzymes as the DNA of Figures 1-14 (SEQ ID
NOS: 1-14).
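
The degeneracy of the genetic code referred to above can be illustrated with a small sketch. The code below assumes the standard genetic code and uses hypothetical names (`CODON_TABLE`, `translate`); the first test sequence is the coding portion of the 5' primer listed as SEQ ID NO:29 in Example 1, and the second is an invented synonymous variant.

```python
# Sketch: two different coding sequences can encode the same polypeptide.
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna):
    """Translate a coding strand (5' to 3') until a stop codon or the end."""
    dna = dna.upper()
    protein = []
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    for pos in range(0, usable, 3):
        amino_acid = CODON_TABLE[dna[pos:pos + 3]]
        if amino_acid == "*":  # stop codon
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    original = "ATGGTGAATGCTATGATTGTC"    # from the primer of SEQ ID NO:29
    synonymous = "ATGGTAAACGCAATGATCGTT"  # hypothetical degenerate variant
    print(translate(original), translate(synonymous))
```

Both sequences differ at the nucleotide level yet translate to the same peptide, which is the sense in which a "different coding sequence" can encode the same mature enzyme.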

The polynucleotide which encodes for the mature enzyme 
of Figures 1-14 (SEQ ID NOS: 15-28) may include, but is not 
limited to: only the coding sequence for the mature enzyme; 
the coding sequence for the mature enzyme and additional 
coding sequence such as a leader sequence or a proprotein 
sequence; the coding sequence for the mature enzyme (and 
optionally additional coding sequence) and non-coding 
sequence, such as introns or non-coding sequence 5' and/or 3' 
of the coding sequence for the mature enzyme.

Thus, the term "polynucleotide encoding an enzyme
(protein)" encompasses a polynucleotide which includes only
coding sequence for the enzyme as well as a polynucleotide
which includes additional coding and/or non-coding sequence.

The present invention further relates to variants of the 
hereinabove described polynucleotides which encode for 
fragments, analogs and derivatives of the enzymes having the 
deduced amino acid sequences of Figures 1-14 (SEQ ID NOS:
15-28). The variant of the polynucleotide may be a naturally
occurring allelic variant of the polynucleotide or a
non-naturally occurring variant of the polynucleotide.

Thus, the present invention includes polynucleotides 
encoding the same mature enzymes as shown in Figures 1-14 
(SEQ ID NOS: 15-28) as well as variants of such 
polynucleotides which variants encode for a fragment, 
derivative or analog of the enzymes of Figures 1-14 (SEQ ID 
NOS: 15-28). Such nucleotide variants include deletion
variants, substitution variants and addition or insertion
variants.

As hereinabove indicated, the polynucleotides may have
a coding sequence which is a naturally occurring allelic
variant of the coding sequences shown in Figures 1-14 (SEQ ID
NOS: 1-14). As known in the art, an allelic variant is an
alternate form of a polynucleotide sequence which may have a
substitution, deletion or addition of one or more
nucleotides, which does not substantially alter the function
of the encoded enzyme.

Fragments of the full length gene of the present
invention may be used as a hybridization probe for a cDNA or
a genomic library to isolate the full length DNA and to
isolate other DNAs which have a high sequence similarity to
the gene or similar biological activity. Probes of this type
preferably have at least 10, preferably at least 15, and even
more preferably at least 30 bases and may contain, for
example, at least 50 or more bases. The probe may also be
used to identify a DNA clone corresponding to a full length
transcript and a genomic clone or clones that contain the
complete gene including regulatory and promoter regions,
exons, and introns. An example of a screen comprises
isolating the coding region of the gene by using the known
DNA sequence to synthesize an oligonucleotide probe. Labeled
oligonucleotides having a sequence complementary to that of
the gene of the present invention are used to screen a
library of genomic DNA to determine which members of the
library the probe hybridizes to.
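
As an illustration of how an oligonucleotide "complementary to that of the gene" can be derived from a known sequence, the following is a minimal sketch. The function names and the default probe length of 30 bases are assumptions chosen to match the preferred lengths stated above; labeling and hybridization conditions are outside its scope.

```python
# Sketch: derive a complementary oligonucleotide probe from a known sequence.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence, written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def design_probe(coding_strand, start, length=30):
    """Return a probe complementary to coding_strand[start:start + length].

    The minimum of 10 bases mirrors the shortest probe length in the text.
    """
    if length < 10:
        raise ValueError("probe should have at least 10 bases")
    target = coding_strand[start:start + length]
    if len(target) < length:
        raise ValueError("target region runs past the end of the sequence")
    return reverse_complement(target)


if __name__ == "__main__":
    # Hypothetical target: start of the coding region in SEQ ID NO:29's primer
    print(design_probe("ATGGTGAATGCTATGATTGTC", 0, length=12))
```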

The present invention further relates to
polynucleotides which hybridize to the hereinabove-described
sequences if there is at least 70%, preferably at least 90%,
and more preferably at least 95% identity between the
sequences. The present invention particularly relates to
polynucleotides which hybridize under stringent conditions to
the hereinabove-described polynucleotides. As herein used,
the term "stringent conditions" means hybridization will
occur only if there is at least 95% and preferably at least
97% identity between the sequences. The polynucleotides
which hybridize to the hereinabove described polynucleotides
in a preferred embodiment encode enzymes which retain
substantially the same biological function or activity as the
mature enzyme encoded by the DNA of Figures 1-14 (SEQ ID
NOS: 1-14).
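
The identity thresholds above can be checked computationally once two sequences have been aligned. The sketch below is an assumption-laden simplification: it takes two already-aligned sequences of equal length (gaps written as "-"), counts matching positions, and compares the result with a threshold. It does not model hybridization chemistry, and the function names are hypothetical.

```python
# Sketch: percent identity between two pre-aligned sequences of equal length.
def percent_identity(seq_a, seq_b):
    """Return the percentage of aligned positions carrying the same residue."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    pairs = zip(seq_a.upper(), seq_b.upper())
    # A gap aligned to a gap is not counted as a match
    matches = sum(1 for a, b in pairs if a == b and a != "-")
    return 100.0 * matches / len(seq_a)


def meets_threshold(seq_a, seq_b, minimum=95.0):
    """True if identity reaches the stated minimum (95% by default)."""
    return percent_identity(seq_a, seq_b) >= minimum


if __name__ == "__main__":
    print(percent_identity("ACGTACGTAC", "ACGTACGTAA"))
```

In practice the alignment step itself (for example a global or local alignment algorithm) determines which positions are compared; that step is deliberately left out here.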

Alternatively, the polynucleotide may have at least 15 
bases, preferably at least 30 bases, and more preferably at 
least 50 bases which hybridize to any part of a 
polynucleotide of the present invention and which has an 
identity thereto, as hereinabove described, and which may or 
may not retain activity. For example, such polynucleotides 
may be employed as probes for the polynucleotides of SEQ ID
NOS: 1-14, for example, for recovery of the polynucleotide or
as a diagnostic probe or as a PCR primer.

Thus, the present invention is directed to 
polynucleotides having at least a 70% identity, preferably at 
least 90% identity and more preferably at least a 95% 
identity to a polynucleotide which encodes the enzymes of SEQ 
ID NOS: 15-28 as well as fragments thereof, which fragments 
have at least 15 bases, preferably at least 30 bases and most
preferably at least 50 bases, which fragments are at least 
90% identical, preferably at least 95% identical and most 
preferably at least 97% identical under stringent conditions 
to any portion of a polynucleotide of the present invention. 

The present invention further relates to enzymes which 
have the deduced amino acid sequences of Figures 1-14 (SEQ ID 
NOS: 15-28) as well as fragments, analogs and derivatives of 
such enzymes.

The terms "fragment," "derivative" and "analog" when
referring to the enzymes of Figures 1-14 (SEQ ID NOS: 15-28)
mean enzymes which retain essentially the same biological
function or activity as such enzymes. Thus, an analog
includes a proprotein which can be activated by cleavage of
the proprotein portion to produce an active mature enzyme.

The enzymes of the present invention may be
recombinant enzymes, natural enzymes or synthetic enzymes,
preferably recombinant enzymes.

The fragment, derivative or analog of the enzymes of 
Figures 1-14 (SEQ ID NOS: 15-28) may be (i) one in which one 
or more of the amino acid residues are substituted with a 
conserved or non-conserved amino acid residue (preferably a 
conserved amino acid residue) and such substituted amino acid 
residue may or may not be one encoded by the genetic code, or 
(ii) one in which one or more of the amino acid residues 
includes a substituent group, or (iii) one in which the 
mature enzyme is fused with another compound, such as a 
compound to increase the half-life of the enzyme (for
example, polyethylene glycol), or (iv) one in which the 
additional amino acids are fused to the mature enzyme, such 
as a leader or secretory sequence or a sequence which is 
employed for purification of the mature enzyme or a 
proprotein sequence. Such fragments, derivatives and analogs 
are deemed to be within the scope of those skilled in the art 
from the teachings herein. 

The enzymes and polynucleotides of the present invention 
are preferably provided in an isolated form, and preferably 
are purified to homogeneity. 

The term "isolated" means that the material is removed
from its original environment (e.g., the natural environment
if it is naturally occurring). For example, a
naturally-occurring polynucleotide or enzyme present in a living animal
is not isolated, but the same polynucleotide or enzyme,
separated from some or all of the coexisting materials in the
natural system, is isolated. Such polynucleotides could be
part of a vector and/or such polynucleotides or enzymes could
be part of a composition, and still be isolated in that such
vector or composition is not part of its natural environment.

The enzymes of the present invention include the enzymes
of SEQ ID NOS: 15-28 (in particular the mature enzyme) as well
as enzymes which have at least 70% similarity (preferably at
least 70% identity) to the enzymes of SEQ ID NOS: 9-16 and
more preferably at least 90% similarity (more preferably at
least 90% identity) to the enzymes of SEQ ID NOS: 15-28 and
still more preferably at least 95% similarity (still more
preferably at least 95% identity) to the enzymes of SEQ ID
NOS: 9-16 and also include portions of such enzymes with such
portion of the enzyme generally containing at least 30 amino
acids and more preferably at least 50 amino acids.

As known in the art "similarity" between two enzymes is 
determined by comparing the amino acid sequence and its 
conserved amino acid substitutes of one enzyme to the 
sequence of a second enzyme.

A variant, i.e. a "fragment", "analog" or "derivative"
polypeptide, and reference polypeptide may differ in amino
acid sequence by one or more substitutions, additions,
deletions, fusions and truncations, which may be present in
any combination.

Among preferred variants are those that vary from a
reference by conservative amino acid substitutions. Such
substitutions are those that substitute a given amino acid in
a polypeptide by another amino acid of like characteristics.
Typically seen as conservative substitutions are the
replacements, one for another, among the aliphatic amino
acids Ala, Val, Leu and Ile; interchange of the hydroxyl
residues Ser and Thr, exchange of the acidic residues Asp and
Glu, substitution between the amide residues Asn and Gln,
exchange of the basic residues Lys and Arg and replacements
among the aromatic residues Phe and Tyr.
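
The distinction between "identity" and "similarity" can be made concrete using the conservative groups just listed. The following is a minimal sketch under stated assumptions: the two sequences are already aligned and of equal length, only the six groups named above count as conservative, and the names `CONSERVATIVE_GROUPS` and `identity_and_similarity` are hypothetical.

```python
# Sketch: percent identity and percent similarity for two aligned peptides,
# counting only the conservative groups listed in the text as "similar".
CONSERVATIVE_GROUPS = [
    "AVLI",  # aliphatic: Ala, Val, Leu, Ile
    "ST",    # hydroxyl: Ser, Thr
    "DE",    # acidic: Asp, Glu
    "NQ",    # amide: Asn, Gln
    "KR",    # basic: Lys, Arg
    "FY",    # aromatic: Phe, Tyr
]
GROUP_OF = {aa: idx for idx, group in enumerate(CONSERVATIVE_GROUPS)
            for aa in group}


def identity_and_similarity(seq_a, seq_b):
    """Return (percent identity, percent similarity) for aligned peptides."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    identical = similar = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == b:
            identical += 1
            similar += 1  # identical residues also count as similar
        elif a in GROUP_OF and GROUP_OF[a] == GROUP_OF.get(b):
            similar += 1  # conservative substitution
    total = len(seq_a)
    return 100.0 * identical / total, 100.0 * similar / total


if __name__ == "__main__":
    # Hypothetical peptides: V/L and K/R are conservative, E/Q is not
    print(identity_and_similarity("MVNAMIVKDE", "MLNAMIVRDQ"))
```

Scoring matrices used by alignment programs define similarity more finely; the grouping here is limited to what the text itself enumerates.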

Most highly preferred are variants which retain the same
biological function and activity as the reference polypeptide
from which they vary.

Fragments or portions of the enzymes of the present
invention may be employed for producing the corresponding
full-length enzyme by peptide synthesis; therefore, the
fragments may be employed as intermediates for producing the
full-length enzymes. Fragments or portions of the
polynucleotides of the present invention may be used to
synthesize full-length polynucleotides of the present
invention.

The present invention also relates to vectors which 
include polynucleotides of the present invention, host cells 
which are genetically engineered with vectors of the 
invention and the production of enzymes of the invention by 
recombinant techniques.

Host cells are genetically engineered (transduced or 
transformed or transfected) with the vectors of this 
invention which may be, for example, a cloning vector or an 
expression vector. The vector may be, for example, in the 
form of a plasmid, a viral particle, a phage, etc. The 
engineered host cells can be cultured in conventional 
nutrient media modified as appropriate for activating 
promoters, selecting transformants or amplifying the genes of
the present invention. The culture conditions, such as 
temperature, pH and the like, are those previously used with 
the host cell selected for expression, and will be apparent 
to the ordinarily skilled artisan. 

The polynucleotides of the present invention may be 
employed for producing enzymes by recombinant techniques. 
Thus, for example, the polynucleotide may be included in any 
one of a variety of expression vectors for expressing an 
enzyme. Such vectors include chromosomal, nonchromosomal and
synthetic DNA sequences, e.g., derivatives of SV40; bacterial
plasmids; phage DNA; baculovirus; yeast plasmids; vectors
derived from combinations of plasmids and phage DNA; viral
DNA such as vaccinia, adenovirus, fowl pox virus, and
pseudorabies. However, any other vector may be used as long
as it is replicable and viable in the host.

The appropriate DNA sequence may be inserted into the 
vector by a variety of procedures. In general, the DNA 
sequence is inserted into an appropriate restriction 
endonuclease site(s) by procedures known in the art. Such 
procedures and others are deemed to be within the scope of 
those skilled in the art.

The DNA sequence in the expression vector is operatively 
linked to an appropriate expression control sequence(s)
(promoter) to direct mRNA synthesis. As representative
examples of such promoters, there may be mentioned: LTR or
SV40 promoter, the E. coli lac or trp, the phage lambda PL
promoter and other promoters known to control expression of 
genes in prokaryotic or eukaryotic cells or their viruses. 
The expression vector also contains a ribosome binding site 
for translation initiation and a transcription terminator. 
The vector may also include appropriate sequences for 
amplifying expression. 

In addition, the expression vectors preferably contain 
one or more selectable marker genes to provide a phenotypic 
trait for selection of transformed host cells such as 
dihydrofolate reductase or neomycin resistance for eukaryotic
cell culture, or such as tetracycline or ampicillin
resistance in E. coli.

The vector containing the appropriate DNA sequence as 
hereinabove described, as well as an appropriate promoter or 
control sequence, may be employed to transform an appropriate 
host to permit the host to express the protein. 

As representative examples of appropriate hosts, there 
may be mentioned: bacterial cells, such as E. coli,
Streptomyces, Bacillus subtilis; fungal cells, such as yeast;
insect cells such as Drosophila S2 and Spodoptera Sf9; animal
cells such as CHO, COS or Bowes melanoma; adenoviruses; plant
cells, etc. The selection of an appropriate host is deemed
to be within the scope of those skilled in the art from the
teachings herein.

More particularly, the present invention also includes 
recombinant constructs comprising one or more of the 
sequences as broadly described above. The constructs 
comprise a vector, such as a plasmid or viral vector, into 
which a sequence of the invention has been inserted, in a 
forward or reverse orientation. In a preferred aspect of this 
embodiment, the construct further comprises regulatory 
sequences, including, for example, a promoter, operably 
linked to the sequence. Large numbers of suitable vectors 
and promoters are known to those of skill in the art, and are 
commercially available. The following vectors are provided
by way of example: Bacterial: pQE70, pQE60, pQE-9 (Qiagen),
pD10, psiX174, pBluescript II KS, pNH8A, pNH16a, pNH18A,
pNH46A (Stratagene); ptrc99a, pKK223-3, pKK233-3, pDR540,
pRIT5 (Pharmacia); Eukaryotic: pSV2CAT, pOG44, pXT1, pSG
(Stratagene); pSVK3, pBPV, pMSG, pSVL (Pharmacia). However,
any other plasmid or vector may be used as long as it is
replicable and viable in the host.

Promoter regions can be selected from any desired gene 
using CAT (chloramphenicol transferase) vectors or other 
vectors with selectable markers. Two appropriate vectors are 
pKK232-8 and pCM7. Particular named bacterial promoters
include lacI, lacZ, T3, T7, gpt, lambda PR, PL and trp.
Eukaryotic promoters include CMV immediate early, HSV
thymidine kinase, early and late SV40, LTRs from retrovirus,
and mouse metallothionein-I. Selection of the appropriate
vector and promoter is well within the level of ordinary 
skill in the art. 

In a further embodiment, the present invention relates
to host cells containing the above-described constructs. The
host cell can be a higher eukaryotic cell, such as a
mammalian cell, or a lower eukaryotic cell, such as a yeast
cell, or the host cell can be a prokaryotic cell, such as a
bacterial cell. Introduction of the construct into the host
cell can be effected by calcium phosphate transfection,
DEAE-Dextran mediated transfection, or electroporation (Davis, L.,
Dibner, M., Battey, I., Basic Methods in Molecular Biology,
(1986)).

The constructs in host cells can be used in a 
conventional manner to produce the gene product encoded by
the recombinant sequence. Alternatively, the enzymes of the
invention can be synthetically produced by conventional
peptide synthesizers.

Mature proteins can be expressed in mammalian cells, 
yeast, bacteria, or other cells under the control of 
appropriate promoters. Cell-free translation systems can 
also be employed to produce such proteins using RNAs derived 
from the DNA constructs of the present invention. 
Appropriate cloning and expression vectors for use with 
prokaryotic and eukaryotic hosts are described by Sambrook, 
et al., Molecular Cloning: A Laboratory Manual, Second
Edition, Cold Spring Harbor, N.Y., (1989), the disclosure of 
which is hereby incorporated by reference. 

Transcription of the DNA encoding the enzymes of the 
present invention by higher eukaryotes is increased by 
inserting an enhancer sequence into the vector. Enhancers 
are cis-acting elements of DNA, usually about from 10 to 300 
bp that act on a promoter to increase its transcription. 
Examples include the SV40 enhancer on the late side of the 
replication origin bp 100 to 270, a cytomegalovirus early 
promoter enhancer, the polyoma enhancer on the late side of 
the replication origin, and adenovirus enhancers.

Generally, recombinant expression vectors will include 
origins of replication and selectable markers permitting 
transformation of the host cell, e.g., the ampicillin 
resistance gene of E. coli and S. cerevisiae TRP1 gene, and 
a promoter derived from a highly-expressed gene to direct 
transcription of a downstream structural sequence. Such
promoters can be derived from operons encoding glycolytic
enzymes such as 3-phosphoglycerate kinase (PGK), α-factor,
acid phosphatase, or heat shock proteins, among others. The
heterologous structural sequence is assembled in appropriate
phase with translation initiation and termination sequences,
and preferably, a leader sequence capable of directing
secretion of translated enzyme. Optionally, the heterologous
sequence can encode a fusion enzyme including an N-terminal
identification peptide imparting desired characteristics,
e.g., stabilization or simplified purification of expressed
recombinant product.

Useful expression vectors for bacterial use are 
constructed by inserting a structural DNA sequence encoding 
a desired protein together with suitable translation 
initiation and termination signals in operable reading phase 
with a functional promoter. The vector will comprise one or
more phenotypic selectable markers and an origin of
replication to ensure maintenance of the vector and to, if
desirable, provide amplification within the host. Suitable
prokaryotic hosts for transformation include E. coli,
Bacillus subtilis, Salmonella typhimurium and various species
within the genera Pseudomonas, Streptomyces, and
Staphylococcus, although others may also be employed as a
matter of choice.

As a representative but nonlimiting example, useful 
expression vectors for bacterial use can comprise a 
selectable marker and bacterial origin of replication derived 
from commercially available plasmids comprising genetic 
elements of the well known cloning vector pBR322 (ATCC 
37017). Such commercial vectors include, for example,
pKK223-3 (Pharmacia Fine Chemicals, Uppsala, Sweden) and GEM1 
(Promega Biotec, Madison, WI, USA). These pBR322 "backbone" 
sections are combined with an appropriate promoter and the 
structural sequence to be expressed.

Following transformation of a suitable host strain and
growth of the host strain to an appropriate cell density, the
selected promoter is induced by appropriate means (e.g.,
temperature shift or chemical induction) and cells are
cultured for an additional period.

Cells are typically harvested by centrifugation,
disrupted by physical or chemical means, and the resulting 
crude extract retained for further purification. 

Microbial cells employed in expression of proteins can 
be disrupted by any convenient method, including freeze-thaw
cycling, sonication, mechanical disruption, or use of cell
lysing agents; such methods are well known to those skilled
in the art.

Various mammalian cell culture systems can also be 
employed to express recombinant protein. Examples of 
mammalian expression systems include the COS-7 lines of 
monkey kidney fibroblasts, described by Gluzman, Cell, 23:175
(1981), and other cell lines capable of expressing a
compatible vector, for example, the C127, 3T3, CHO, HeLa and
BHK cell lines. Mammalian expression vectors will comprise
an origin of replication, a suitable promoter and enhancer,
and also any necessary ribosome binding sites,
polyadenylation site, splice donor and acceptor sites,
transcriptional termination sequences, and 5' flanking
nontranscribed sequences. DNA sequences derived from the
SV40 splice and polyadenylation sites may be used to provide
the required nontranscribed genetic elements.

The enzyme can be recovered and purified from 
recombinant cell cultures by methods including ammonium 
sulfate or ethanol precipitation, acid extraction, anion or 
cation exchange chromatography, phosphocellulose 
chromatography, hydrophobic interaction chromatography,
affinity chromatography, hydroxylapatite chromatography and
lectin chromatography. Protein refolding steps can be used,
as necessary, in completing configuration of the mature
protein. Finally, high performance liquid chromatography
(HPLC) can be employed for final purification steps.

The enzymes of the present invention may be a naturally 
purified product, or a product of chemical synthetic 
procedures, or produced by recombinant techniques from a
prokaryotic or eukaryotic host (for example, by bacterial,
yeast, higher plant, insect and mammalian cells in culture).
Depending upon the host employed in a recombinant production
procedure, the enzymes of the present invention may be
glycosylated or may be non-glycosylated. Enzymes of the
invention may or may not also include an initial methionine 
amino acid residue. 

β-galactosidase hydrolyzes lactose to galactose and
glucose. Accordingly, the OC1/4V, 9N2-31B/G, AEDII12RA-18B/G
and F1-12G enzymes may be employed in the food processing
industry for the production of low lactose content milk and
for the production of galactose or glucose from lactose
contained in whey obtained in a large amount as a by-product
in the production of cheese. Generally, it is desired that
enzymes used in food processing, such as the aforementioned
β-galactosidases, be stable at elevated temperatures to help
prevent microbial contamination.

These enzymes may also be employed in the pharmaceutical 
industry. The enzymes are used to treat intolerance to 
lactose. In this case, a thermostable enzyme is desired, as 
well. Thermostable β-galactosidases also have uses in
diagnostic applications, where they are employed as reporter
molecules.

Glucosidases act on soluble cellooligosaccharides from 
the non-reducing end to give glucose as the sole product. 
Glucanases (endo- and exo-) act in the depolymerization of 
cellulose, generating more non-reducing ends
(endoglucanases, for instance, act on internal linkages yielding
cellobiose, glucose and cellooligosaccharides as products).
β-glucosidases are used in applications where glucose is the
desired product. Accordingly, M11TL, F1-12G, GC74-22G and
MSB8-6G (and OC1/4V, VC1-7G1, 9N2-31B/G and AEDII12RA-18B/G)
may be employed in a wide variety of industrial applications, 
including in corn wet milling for the separation of starch 
and gluten, in the fruit industry for clarification and 
equipment maintenance, in baking for viscosity reduction, in 
the textile industry for the processing of blue jeans, and in 
the detergent industry as an additive. For these and other 
applications, thermostable enzymes are desirable. 

Antibodies generated against the enzymes corresponding 
to a sequence of the present invention can be obtained by 
direct injection of the enzymes into an animal or by 
administering the enzymes to an animal, preferably a
nonhuman. The antibody so obtained will then bind the
enzymes themselves. In this manner, even a sequence encoding
only a fragment of the enzymes can be used to generate
antibodies binding the whole native enzymes. Such antibodies
can then be used to isolate the enzyme from cells expressing 
that enzyme. 

For preparation of monoclonal antibodies, any technique
which provides antibodies produced by continuous cell line 
cultures can be used. Examples include the hybridoma 
technique (Kohler and Milstein, 1975, Nature, 256:495-497) , 
the trioma technique, the human B-cell hybridoma technique 
(Kozbor et al., 1983, Immunology Today 4:72), and the EBV- 
hybridoma technique to produce human monoclonal antibodies 
(Cole, et al., 1985, in Monoclonal Antibodies and Cancer 
Therapy, Alan R. Liss, Inc., pp. 77-96).

Techniques described for the production of single chain 
antibodies (U.S. Patent 4,946,778) can be adapted to produce 
single chain antibodies to immunogenic enzyme products of 
this invention. Also, transgenic mice may be used to express 
humanized antibodies to immunogenic enzyme products of this
invention.

Antibodies generated against the enzyme of the present
invention may be used in screening for similar enzymes from
other organisms and samples. Such screening techniques are
known in the art; for example, one such screening assay is
described in "Methods for Measuring Cellulase Activities",
Methods in Enzymology, Vol 160, pp. 87-116, which is hereby
incorporated by reference in its entirety.

The present invention will be further described with 
reference to the following examples; however, it is to be 
understood that the present invention is not limited to such 
examples. All parts or amounts, unless otherwise specified, 
are by weight.

In order to facilitate understanding of the following 
examples certain frequently occurring methods and/or terms 
will be described. 

"Plasmids" are designated by a lower case p preceded 
and/or followed by capital letters and/or numbers. The 
starting plasmids herein are either commercially available, 
publicly available on an unrestricted basis, or can be 
constructed from available plasmids in accord with published 
procedures. In addition, equivalent plasmids to those 
described are known in the art and will be apparent to the 
ordinarily skilled artisan.

"Digestion" of DNA refers to catalytic cleavage of the
DNA with a restriction enzyme that acts only at certain
sequences in the DNA. The various restriction enzymes used
herein are commercially available and their reaction
conditions, cofactors and other requirements were used as
would be known to the ordinarily skilled artisan. For
analytical purposes, typically 1 µg of plasmid or DNA
fragment is used with about 2 units of enzyme in about 20 µl
of buffer solution. For the purpose of isolating DNA
fragments for plasmid construction, typically 5 to 50 µg of
DNA are digested with 20 to 250 units of enzyme in a larger
volume. Appropriate buffers and substrate amounts for
particular restriction enzymes are specified by the
manufacturer. Incubation times of about 1 hour at 37°C are
ordinarily used, but may vary in accordance with the
supplier's instructions. After digestion the reaction is
electrophoresed directly on a polyacrylamide gel to isolate
the desired fragment.
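
Because a restriction enzyme "acts only at certain sequences", candidate cleavage sites can be located by a simple text search. The sketch below is illustrative only: the recognition sequences are the commonly published ones for the enzymes named in Example 1 and are not stated in this document, the names `RECOGNITION_SITES` and `find_sites` are hypothetical, and positions are 0-based offsets of the start of the recognition sequence rather than exact cut positions.

```python
# Sketch: locate restriction enzyme recognition sequences in a DNA string.
# Recognition sequences are the commonly published ones (an assumption
# relative to this document, which names the enzymes but not their sites).
RECOGNITION_SITES = {
    "EcoRI": "GAATTC",
    "BglII": "AGATCT",
    "KpnI": "GGTACC",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
    "SphI": "GCATGC",
}


def find_sites(dna, enzyme):
    """Return 0-based start offsets of every recognition site for enzyme."""
    site = RECOGNITION_SITES[enzyme]
    dna = dna.upper()
    positions = []
    start = dna.find(site)
    while start != -1:
        positions.append(start)
        start = dna.find(site, start + 1)  # allow overlapping matches
    return positions


if __name__ == "__main__":
    # 5' primer listed as SEQ ID NO:29 in Example 1
    primer = "CCGAGAATTCATTAAAGAGGAGAAATTAACTATGGTGAATGCTATGATTGTC"
    print(find_sites(primer, "EcoRI"))
```

A check of this kind is also a quick way to confirm that a primer carries the restriction site intended for cloning into the chosen vector.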

Size separation of the cleaved fragments is performed 
using 8 percent polyacrylamide gel described by Goeddel, D. 
et al., Nucleic Acids Res., 8:4057 (1980).

"Oligonucleotides" refers to either a single stranded 
polydeoxynucleotide or two complementary polydeoxynucleotide 
strands which may be chemically synthesized. Such synthetic 
oligonucleotides have no 5' phosphate and thus will not 
ligate to another oligonucleotide without adding a phosphate 
with an ATP in the presence of a kinase. A synthetic 
oligonucleotide will ligate to a fragment that has not been
dephosphorylated.

"Ligation" refers to the process of forming
phosphodiester bonds between two double stranded nucleic acid
fragments (Maniatis, T., et al., Id., p. 146). Unless
otherwise provided, ligation may be accomplished using known
buffers and conditions with 10 units of T4 DNA ligase
("ligase") per 0.5 µg of approximately equimolar amounts of
the DNA fragments to be ligated.
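
The stated ratio of 10 units of ligase per 0.5 µg of DNA scales linearly with the amount of DNA. A minimal sketch of that arithmetic follows; the function name is hypothetical and the linear scaling beyond the stated ratio is an assumption.

```python
# Sketch: scale the ligase amount from the ratio given in the text
# (10 units of T4 DNA ligase per 0.5 microgram of DNA fragments).
LIGASE_UNITS_PER_UG = 10 / 0.5  # 20 units per microgram


def ligase_units(total_dna_ug):
    """Return units of T4 DNA ligase for a given total mass of DNA (in ug)."""
    if total_dna_ug < 0:
        raise ValueError("DNA amount cannot be negative")
    return total_dna_ug * LIGASE_UNITS_PER_UG


if __name__ == "__main__":
    print(ligase_units(0.5))
```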

Unless otherwise stated, transformation was performed as 
described in the method of Graham, F. and Van der Eb, A., 
Virology, 52:456-457 (1973).

Example 1

Bacterial Expression and Purification of Glycosidase Enzymes

DNA encoding the enzymes of the present invention, SEQ
ID NOS: 1 through 8, was initially amplified from a
pBluescript vector containing the DNA by the PCR technique
using the primers noted herein. The amplified sequences were
then inserted into the respective pQE vector listed beneath
the primer sequences, and the enzyme was expressed according
to the protocols set forth herein. The 5' and 3' primer
sequences for the respective genes are as follows:

Thermococcus AEDII12RA-18B/G

5' CCGAGAATTCATTAAAGAGGAGAAATTAACTATGGTGAATGCTATGATTGTC
(SEQ ID NO:29)

3' CGGAAGATCTTCATAGCTCCGGAAGCCCATA (SEQ ID NO:30)

Vector: pQE12; and contains the following restriction enzyme
sites 5' EcoRI and 3' Bgl II.

OC1/4V-33B/G

5' CCGAGAATTCATTAAAGAGGAGAAATTAACTATGATAAGAAGGTCCGATTTTCC
(SEQ ID NO: 31)

3' CGGAAGATCTTTAAGATTTTAGAAATTCCTT (SEQ ID NO: 32)

Vector: pQE12; and contains the following restriction enzyme
sites 5' EcoRI and 3' Bgl II.



Thermococcus 9N2-31B/G

5' C[illegible]CGAGAATTCATTAAAGAGGAGAAATTAACTATGCTACCAGAAGGCTTTCTC
(SEQ ID NO:33)

3' CGGAGGTACCTCACCCAAGTCCGAACTTCTC (SEQ ID NO: 34)

Vector: pQE30; and contains the following restriction enzyme
sites 5' EcoRI and 3' KpnI.

Staphylothermus marinus F1-12G

5' CCGAGAATTCATTAAAGAGGAGAAATTAACTATGATAAGGTTTCCTGATTAT
(SEQ ID NO: 35)

3' CGGAAGATCTTTATTCGAGGTTCTTTAATCC (SEQ ID NO:36)

Vector: pQE12; and contains the following restriction enzyme
sites 5' EcoRI and 3' Bgl II.

Thermococcus chitonophagus GC74-22G 
5' CCGAGAATTCATTCZATT AAAGAGGAGAAATT AACTATGCI^CCAGG^ 

(SEQ ID NO:37) 

3' CGGAGGATCCCTACCCCTCCTCTAAGATCTC (SEQ ID NO:38) 

Vector: pQE12; and contains the following restriction enzyme 
sites 5' EcoRI and 3' BamHI. 

M11TL 

5' AATAATCTAGAGCATGCAATTCCCCAAAGACTTCATGATAG (SEQ ID NO:39) 

3' AATAAAAGCTTACTGGATCAGTGTAAGATGCT (SEQ ID NO:40) 

Vector: pQE70; and contains the following restriction enzyme 
sites 5' SphI and 3' Hind III. 

Thermotoga maritima MSB8-6G 

5' CCGACAATTGATTAAAGAGGAGAAATTAACTATGGAAAGGATCGATGAAATT 
(SEQ ID NO:41) 

3' CGGAGGTACCTCATGGTTTGAATCTCTTCTC (SEQ ID NO:42) 

Vector: pQE12; and contains the following restriction enzyme 
sites 5' EcoRI and 3' KpnI. 

Pyrococcus furiosus VC1-7G1 

5' CCGACAATTGATTAAAGAGGAGAAATTAACTATGTTCCCTGAAAAGTTCCTT 
(SEQ ID NO:43) 

3' CGGAGGTACCTCATCCCCTCAGCAATTCCTC (SEQ ID NO:44) 

Vector: pQE12; and contains the following restriction enzyme 
sites 5' EcoRI and 3' KpnI. 

Bankia gouldi endoglucanase (37GP1) 

5' AATAAGGATCCGTTTAGCGACGCTCGC 
(SEQ ID NO:45) 

3' AATAAAAGCTTCCGGGTTGTACAGCGGTAATAGGC (SEQ ID NO: 46) 

Vector: pQES2; and contains the following restriction enzyme 
sites 5' Bam HI and 3' Hind III. 

Thermotoga maritima α-galactosidase (6GC2) 

5 ' TTTATTGAATTCATTAAAGAGGAGAAA^ 
(SEQ ID NO:47) 

3' TCTATAAAGCTTTCATTCTCTCTCACCCTCTTCGTAGAAG (SEQ ID NO:48) 

Vector: pQET; and contains the following restriction enzyme 
sites 5' EcoRI and 3' Hind III. 

Thermotoga maritima β-mannanase (6GP2) 

5' TTTATTCAATTGATTAAAGAGGAGAAATTAACTATGGGGATTGGTGGCGACGAC 
(SEQ ID NO:49) 

3' TTTATTAAGCTTATCTTTTCATATTCACATACCTCC (SEQ ID NO:50) 

Vector: pQEt; and contains the following restriction enzyme 
sites 5' Hind III and 3' EcoRI. 

AEPII 1a β-mannanase (63GB1) 

5' TTTATTGAATTCATTAAAGAGGAGAAATTAACTATGCTACCAGAAGAGTTCCTATGGGGC 
(SEQ ID NO:51) 

3' TTTATTAAGCTTCTCATCAACGGCTATGGTCTTCATTTC (SEQ ID NO:52) 

Vector: pQEt; and contains the following restriction enzyme 
sites 5' Hind III and 3' EcoRI. 

OC1/4V endoglucanase (33GP1) 

5 < aa^cAATT^TTCATTAAA^ 
(SEQ ID NO: 53) 

3' TTTTTCC^TCCAATTCTTCATTTACTC^TTGCCTG (SEQ ID NO : 54 ) 

Vector: pQEt; and contains the following restriction enzyme 
sites 5' BamHI and 3' EcoRI . 

Thermotoga maritima pullulanase (6GP3) 

5' TTTTGGAATTCATTAAAGAGGAGAAATTAACTATGGAACTGATCATAGAAGGTTAC 
(SEQ ID NO:55) 

3' ATAAGAAGCTTTTCACTCTCTGTACAGAACGTACGC (SEQ ID NO:56) 

Vector: pQEt; and contains the following restriction enzyme 
sites 5' EcoRI and 3' Hind III. 

The restriction enzyme sites indicated correspond to the 
restriction enzyme sites on the bacterial expression vector 
indicated for the respective gene (Qiagen, Inc., Chatsworth, 
CA). The pQE vector encodes antibiotic resistance (Amp^r), a 
bacterial origin of replication (ori), an IPTG-regulatable 
promoter operator (P/O), a ribosome binding site (RBS), a 6- 
His tag and restriction enzyme sites. 
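
By way of illustration only, the relationship between a primer and the restriction site named for it can be checked with a short script. The following is a minimal sketch, not part of the disclosed method: the recognition sequences are the standard ones for the named enzymes, the ribosome binding site motif is the stretch shared by most of the 5' primers above, and the function names are hypothetical.

```python
# Standard recognition sequences for the enzymes named in the primer list.
SITES = {
    "EcoRI": "GAATTC",
    "BglII": "AGATCT",
    "KpnI": "GGTACC",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
    "SphI": "GCATGC",
}

# Motif shared by most 5' primers above (assumed here to be the RBS region).
RBS = "AAAGAGGAGAAA"


def find_sites(primer):
    """Return {enzyme: 0-based index} for each recognition site in the primer."""
    seq = primer.upper().replace(" ", "")
    return {name: seq.find(site) for name, site in SITES.items() if site in seq}


def reverse_complement(seq):
    """Reverse complement of a DNA string containing only A, C, G, T."""
    comp = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(comp[base] for base in reversed(seq.upper()))


fwd = "CCGAGAATTCATTAAAGAGGAGAAATTAACTATGGTGAATGCTATGATTGTC"  # SEQ ID NO:29
rev = "CGGAAGATCTTCATAGCTCCGGAAGCCCATA"  # SEQ ID NO:30

print(find_sites(fwd))   # sites found in the 5' primer
print(find_sites(rev))   # sites found in the 3' primer
print(RBS in fwd, fwd.index("ATG"))  # RBS present, position of start codon
# The three bases after the Bgl II site in the 3' primer, read on the coding strand:
print(reverse_complement(rev[10:13]))
```

For this primer pair the sketch reports an EcoRI site in the 5' primer and a Bgl II site in the 3' primer, and the bases immediately following the Bgl II site read as a stop codon on the coding strand.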

The pQE vector was digested with the restriction enzymes 
indicated. The amplified sequences were ligated into the 
respective pQE vector and inserted in frame with the sequence 
encoding for the RBS. The ligation mixture was then used to 
transform the E. coli strain M15/pREP4 (Qiagen, Inc.) by 
electroporation. M15/pREP4 contains multiple copies of the 
plasmid pREP4, which expresses the lacI repressor and also 
confers kanamycin resistance (Kan^r). Transformants were 
identified by their ability to grow on LB plates and 
ampicillin/kanamycin resistant colonies were selected. 
Plasmid DNA was isolated and confirmed by restriction 
analysis. Clones containing the desired constructs were 
grown overnight (O/N) in liquid culture in LB media 
supplemented with both Amp (100 µg/ml) and Kan (25 µg/ml). 
The O/N culture was used to inoculate a large culture at a 
ratio of 1:100 to 1:250. The cells were grown to an optical 
density at 600 nm (O.D.600) of between 0.4 and 0.6. IPTG 
("Isopropyl-β-D-thiogalactopyranoside") was then added to a 
final concentration of 1 mM. IPTG induces by inactivating 
the lacI repressor, clearing the P/O and leading to increased 
gene expression. Cells were grown an extra 3 to 4 hours. 
Cells were then harvested by centrifugation. 
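
The inoculation ratio and the IPTG addition described above reduce to simple dilution arithmetic. The sketch below is illustrative only; the 1 L culture volume and the 1 M IPTG stock are assumptions made for the example and are not stated in the protocol.

```python
def inoculum_volume_ml(culture_ml, ratio):
    """Volume of overnight culture needed for a 1:ratio inoculation."""
    return culture_ml / ratio


def stock_volume_ml(final_mM, culture_ml, stock_mM):
    """Solve C1 * V1 = C2 * V2 for the stock volume V1."""
    return final_mM * culture_ml / stock_mM


culture_ml = 1000.0      # hypothetical 1 L expression culture
iptg_stock_mM = 1000.0   # hypothetical 1 M IPTG stock

print(inoculum_volume_ml(culture_ml, 100))   # ml of O/N culture at 1:100
print(inoculum_volume_ml(culture_ml, 250))   # ml of O/N culture at 1:250
print(stock_volume_ml(1.0, culture_ml, iptg_stock_mM))  # ml IPTG stock for 1 mM
```

Under these assumptions a 1 L culture takes between 4 ml and 10 ml of overnight culture and 1 ml of IPTG stock.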

The primer sequences set out above may also be employed 
to isolate the target gene from the deposited material by 
hybridization techniques described above. 



Example 2 

Isolation of a Selected Clone From the Deposited Genomic 
Clones 

A clone is isolated directly by screening the 
deposited material using the oligonucleotide primers set 
forth in Example 1 for the particular gene desired to be 
isolated. The specific oligonucleotides are synthesized 
using an Applied Biosystems DNA synthesizer. The 
oligonucleotides are labeled with 32P-γ-ATP using T4 
polynucleotide kinase and purified according to a standard 
protocol (Maniatis et al., Molecular Cloning: A Laboratory 
Manual, Cold Spring Harbor Press, Cold Spring Harbor, NY, 1982). 
The deposited clones in the pBluescript vectors may be 
employed to transform bacterial hosts which are then plated 
on 1.5% agar plates to the density of 20,000-50,000 
pfu/150 mm plate. These plates are screened using Nylon 
membranes according to the standard screening protocol 
(Stratagene, 1993). Specifically, the Nylon membrane with 
denatured and fixed DNA is prehybridized in 6 x SSC, 20 mM 
NaH2PO4, 0.4% SDS, 5 x Denhardt's, 500 µg/ml denatured, 
sonicated salmon sperm DNA; and 6 x SSC, 0.1% SDS. After 
one hour of prehybridization, the membrane is hybridized 
with hybridization buffer (6 x SSC, 20 mM NaH2PO4, 0.4% SDS, 
500 µg/ml denatured, sonicated salmon sperm DNA) with 1 x 10^6 
cpm/ml 32P-probe overnight at 42 °C. The membrane is washed 
at 45-50 °C with washing buffer (6 x SSC, 0.1% SDS) for 20-30 
minutes, dried and exposed to Kodak X-ray film overnight. 
Positive clones are isolated and purified by secondary and 
tertiary screening. The purified clone is sequenced to 
verify its identity to the primer sequence. 

Once the clone is isolated, the two oligonucleotide 
primers corresponding to the gene of interest are used to 
amplify the gene from the deposited material. A polymerase 
chain reaction is carried out in 25 µl of reaction mixture 
with 0.5 µg of the DNA of the gene of interest. The 
reaction mixture is 1.5-5 mM MgCl2, 0.01% (w/v) gelatin, 20 
µM each of dATP, dCTP, dGTP, dTTP, 25 pmol of each primer 
and 0.25 Unit of Taq polymerase. Thirty five cycles of PCR 
(denaturation at 94 °C for 1 min; annealing at 55 °C for 1 
min; elongation at 72 °C for 1 min) are performed with the 
Perkin-Elmer Cetus automated thermal cycler. The amplified 
product is analyzed by agarose gel electrophoresis and the 
DNA band with expected molecular weight is excised and 
purified. The PCR product is verified to be the gene of 
interest by subcloning and sequencing the DNA product. The 
ends of the newly purified genes are nucleotide sequenced 
to identify full length sequences. Complete sequencing of 
full length genes is then performed by Exonuclease III 
digestion or primer walking. 
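
The reaction quantities given for the amplification can be cross-checked with the short sketch below. It is offered as an illustration under stated assumptions: the reaction volume is taken as 25 µl and the nucleotide concentration as 20 µM each, and ramp times between temperature steps are ignored. The function names are hypothetical.

```python
def cycling_minutes(cycles, step_minutes):
    """Total hold time for a cycling program, ignoring ramp times."""
    return cycles * sum(step_minutes)


def primer_concentration_uM(pmol, reaction_ul):
    """pmol per µl is numerically equal to µmol per litre (µM)."""
    return pmol / reaction_ul


def dntp_pmol(conc_uM, reaction_ul):
    """Amount of each dNTP in the reaction; µM times µl gives pmol."""
    return conc_uM * reaction_ul


print(cycling_minutes(35, [1, 1, 1]))    # minutes of hold time for 35 cycles
print(primer_concentration_uM(25, 25))   # µM of each primer
print(dntp_pmol(20, 25))                 # pmol of each dNTP
```

With these readings, thirty five cycles amount to 105 minutes of hold time, each primer is present at 1 µM, and each nucleotide at 500 pmol per reaction.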

Example 3 

Screening for Galactosidase Activity 
Screening procedures for α-galactosidase protein 
activity may be assayed for as follows: 

Substrate plates were provided by a standard plating 
procedure. Dilute XL1-Blue MRF' E. coli host (Stratagene 
Cloning Systems, La Jolla, CA) to O.D.600 = 1.0 with NZY 
media. In 15 ml tubes, inoculate 200 µl diluted host cells 
with phage. Mix gently and incubate tubes at 37 °C for 15 
min. Add approximately 3.5 ml LB top agarose (0.7%) 
containing 1 mM IPTG to each tube and pour onto an NZY 
plate surface. Allow to cool and incubate at 37 °C 
overnight. The assay plates contain the substrate 
p-nitrophenyl α-galactoside (Sigma) (200 mg/100 ml) in 100 mM 
NaCl, 100 mM potassium phosphate, 1% (w/v) agarose. The 
plaques are overlaid with nitrocellulose and incubated at 
4 °C for 30 minutes, whereupon the nitrocellulose is removed 
and overlaid onto the substrate plates. The substrate 
plates are then incubated at 70 °C for 20 minutes. 

Example 4 

Screening of Clones for Mannanase Activity 
A solid phase screening assay was utilized as a 
primary screening method to test clones for β-mannanase 
activity. 

A culture solution of the Y1090-E. coli host strain 
(Stratagene Cloning Systems, La Jolla, CA) was diluted to 
O.D.600 = 1.0 with NZY media. The amplified Thermotoga 
maritima lambda gt11 library was diluted in SM 
(phage dilution buffer): 5 x 10^7 pfu/µl diluted 1:1000 
then 1:100 to 5 x 10^2 pfu/µl. Then 8 µl of phage dilution 
(5 x 10^2 pfu/µl) was plated in 200 µl host cells. They 
were then incubated in 15 ml tubes at 37 °C for 15 minutes. 
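
The two-step dilution and the number of phage plated per tube follow directly from the titers given above. A minimal sketch of that arithmetic, with hypothetical function names:

```python
def serial_dilute(titer, factors):
    """Apply successive 1:factor dilutions to a starting titer."""
    for factor in factors:
        titer /= factor
    return titer


stock_pfu_per_ul = 5e7                       # amplified library titer
working = serial_dilute(stock_pfu_per_ul, [1000, 100])
plated_pfu = working * 8                     # 8 µl of the working dilution

print(working)      # pfu/µl after 1:1000 then 1:100
print(plated_pfu)   # pfu added to each 200 µl aliquot of host cells
```

The working dilution comes to 5 x 10^2 pfu/µl, so each tube receives about 4,000 pfu.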

Approximately 4 ml of molten LB top agarose (0.7%) at 
approximately 52 °C was added to each tube and the mixture 
was poured onto the surface of LB agar plates. The agar 
plates were then incubated at 37 °C for five hours. The 
plates were replicated and induced with 10 mM IPTG-soaked 
Duralon-UV™ nylon membranes (Stratagene Cloning Systems, 
La Jolla, CA) overnight. The nylon membranes and plates 
were marked with a needle to keep their orientation and the 
nylon membranes were then removed and stored at 4 °C. 

An Azo-galactomannan overlay was applied to the LB 
plates containing the lambda plaques. The overlay contains 
1% agarose, 50 mM potassium-phosphate buffer pH 7, 0.4% 
Azocarob-galactomannan (Megazyme, Australia). The plates 
were incubated at 72 °C. The Azocarob-galactomannan 
treated plates were observed after 4 hours then returned to 
incubation overnight. Putative positives were identified 
by clearing zones on the Azocarob-galactomannan plates. 
Two positive clones were observed. 

The nylon membranes referred to above, which 
correspond to the positive clones, were retrieved, oriented 
over the plate and the portions matching the locations of 
the clearing zones for positive clones were cut out. Phage 
was eluted from the membrane cut-out portions by soaking 
the individual portions in 500 µl SM (phage dilution 
buffer) and 25 µl CHCl3. 



Example 5 

Screening of Clones for Mannosidase Activity 
A solid phase screening assay was utilized as a 
primary screening method to test clones for β-mannosidase 
activity. 

A culture solution of the Y1090-E. coli host strain 
(Stratagene Cloning Systems, La Jolla, CA) was diluted to 
O.D.600 = 1.0 with NZY media. The amplified AEPII 1a 
lambda gt11 library was diluted in SM (phage 
dilution buffer): 5 x 10^7 pfu/µl diluted 1:1000 then 1:100 
to 5 x 10^2 pfu/µl. Then 8 µl of phage dilution 
(5 x 10^2 pfu/µl) was plated in 200 µl host cells. They 
were then incubated in 15 ml tubes at 37 °C for 15 minutes. 

Approximately 4 ml of molten LB top agarose (0.7%) at 
approximately 52 °C was added to each tube and the mixture 
was poured onto the surface of LB agar plates. The agar 
plates were then incubated at 37 °C for five hours. The 
plates were replicated and induced with 10 mM IPTG-soaked 
Duralon-UV™ nylon membranes (Stratagene Cloning Systems, 
La Jolla, CA) overnight. The nylon membranes and plates 
were marked with a needle to keep their orientation and the 
nylon membranes were then removed and stored at 4 °C. 

A p-nitrophenyl-β-D-mannopyranoside overlay was 
applied to the LB plates containing the lambda plaques. 
The overlay contains 1% agarose, 50 mM potassium-phosphate 
buffer pH 7, 0.4% p-nitrophenyl-β-D-mannopyranoside 
(Megazyme, Australia). The plates were incubated at 72 °C. 
The p-nitrophenyl-β-D-mannopyranoside treated plates were 
observed after 4 hours then returned to incubation 
overnight. Putative positives were identified by clearing 
zones on the p-nitrophenyl-β-D-mannopyranoside plates. 
Two positive clones were observed. 

The nylon membranes referred to above, which 
correspond to the positive clones, were retrieved, oriented 
over the plate and the portions matching the locations of 
the clearing zones for positive clones were cut out. Phage 
was eluted from the membrane cut-out portions by soaking 
the individual portions in 500 µl SM (phage dilution 
buffer) and 25 µl CHCl3. 

Example 6 
Screening for Pullulanase Activity 

Screening procedures for pullulanase protein activity 
may be assayed for as follows: 

Substrate plates were provided by a standard plating 
procedure. Host cells are diluted to O.D.600 = 1.0 with NZY 
or appropriate media. In 15 ml tubes, inoculate 200 µl 
diluted host cells with phage. Mix gently and incubate 
tubes at 37 °C for 15 min. Approximately 3.5 ml LB top 
agarose (0.7%) is added to each tube and the mixture is 
plated, allowed to cool, and incubated at 37 °C for about 28 
hours. Overlays of 4.5 ml of the following substrate are 
poured: 

100 ml total volume 

0.5 g    Red Pullulan (Megazyme, Australia) 
1.0 g    Agarose 
5 ml     Buffer (Tris-HCl pH 7.2 @ 75 °C) 
2 ml     5 M NaCl 
5 ml     CaCl2 (100 mM) 
85 ml    dH2O 

Plates are cooled at room temperature, and then incubated 
at 75 °C for 2 hours. Positives are observed as showing 
substrate degradation. 
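
The overlay recipe is stated per 100 ml, while each plate takes 4.5 ml. The sketch below scales the recipe to a batch and reports the final salt concentrations. It is illustrative only: the batch of 20 plates is hypothetical, the Red Pullulan mass is taken as 0.5 g per 100 ml, and no allowance is made for pipetting losses.

```python
# Overlay components per 100 ml, as listed in the recipe above.
RECIPE_PER_100_ML = {
    "Red Pullulan (g)": 0.5,       # mass as read from the listing
    "Agarose (g)": 1.0,
    "Tris-HCl buffer (ml)": 5.0,
    "5 M NaCl (ml)": 2.0,
    "100 mM CaCl2 (ml)": 5.0,
    "dH2O (ml)": 85.0,
}
OVERLAY_ML_PER_PLATE = 4.5


def scale_recipe(total_ml):
    """Scale every component linearly to the requested batch volume."""
    factor = total_ml / 100.0
    return {name: amount * factor for name, amount in RECIPE_PER_100_ML.items()}


def final_mM(stock_mM, stock_ml, total_ml=100.0):
    """Final concentration after diluting a stock into the total volume."""
    return stock_mM * stock_ml / total_ml


plates = 20  # hypothetical number of plates to overlay
batch_ml = plates * OVERLAY_ML_PER_PLATE
for name, amount in scale_recipe(batch_ml).items():
    print(f"{name}: {amount:.2f}")

print(final_mM(5000.0, 2.0))  # final NaCl concentration in mM
print(final_mM(100.0, 5.0))   # final CaCl2 concentration in mM
```

On these figures the overlay contains 100 mM NaCl and 5 mM CaCl2, and twenty plates require 90 ml of overlay.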

Example 7 

Screening for Endoglucanase Activity 
Screening procedures for endoglucanase protein 
activity may be assayed for as follows: 

1. The gene library is plated onto 6 LB/GelRite/0.1% 
CMC/NZY agar plates (~4,800 plaque forming units/plate) in 
E. coli host with LB agarose as top agarose. The plates are 
incubated at 37 °C overnight. 

2. Plates are chilled at 4 °C for one hour. 

3. The plates are overlaid with Duralon membranes 
(Stratagene) at room temperature for one hour and the 
membranes are oriented and lifted off the plates and stored 
at 4 °C. 

4. The top agarose layer is removed and plates are 
incubated at 37 °C for ~3 hours. 

5. The plate surface is rinsed with NaCl. 

6. The plate is stained with 0.1% Congo Red for 15 
minutes. 

7. The plate is destained with 1M NaCl. 

8. The putative positives identified on plate are 
isolated from the Duralon membrane (positives are 
identified by clearing zones around clones). The phage is 
eluted from the membrane by incubating in 500 µl SM + 25 µl 
CHCl3. 

9. Insert DNA is subcloned into any appropriate 
cloning vector and subclones are reassayed for CMCase 
activity using the following protocol: 

i) Spin 1 ml overnight miniprep of clone at 
maximum speed for 3 minutes. 

ii) Decant the supernatant and use it to fill 
"wells" that have been made in an LB/GelRite/0.1% CMC 
plate. 

iii) Incubate at 37 °C for 2 hours. 

iv) Stain with 0.1% Congo Red for 15 minutes. 

v) Destain with 1M NaCl for 15 minutes. 

vi) Identify positives by clearing zone around 
clone. 

Numerous modifications and variations of the present 
invention are possible in light of the above teachings and, 
therefore, within the scope of the appended claims, the 
invention may be practiced otherwise than as particularly 
described. 

WHAT IS CLAIMED IS: 

1. An isolated polynucleotide comprising a member 
selected from the group consisting of: 

(a) a polynucleotide having at least a 70% 
identity to a polynucleotide encoding an enzyme comprising 
amino acid sequences set forth in SEQ ID NOS: 15-28; 

(b) a polynucleotide which is complementary to 
the polynucleotide of (a); and 

(c) a polynucleotide comprising at least 15 
bases of the polynucleotide of (a) or (b). 

2. The polynucleotide of Claim 1 wherein the 
polynucleotide is DNA. 

3. The polynucleotide of Claim 1 wherein the 
polynucleotide is RNA. 

4. The polynucleotide of Claim 2 which encodes an 
enzyme comprising an amino acid sequence which is a member 
selected from the group 

(b) according to SEQ ID NO: 16; 
(c) according to SEQ ID NO: 17; 
(d) according to SEQ ID NO: 18; 
(e) according to SEQ ID NO: 19; 
(f) according to SEQ ID NO: 20; 
(g) according to SEQ ID NO: 21; 
(h) according to SEQ ID NO: 22; 
(i) according to SEQ ID NO: 23; 
(j) according to SEQ ID NO: 24; 
(k) according to SEQ ID NO: 25; 
(l) according to SEQ ID NO: 26; 
(m) according to SEQ ID NO: 27; 
(n) according to SEQ ID NO: 28. 

5. An isolated polynucleotide comprising a member 
selected from the group consisting of: 

(a) a polynucleotide having at least a 70% 
identity to a polynucleotide encoding an enzyme encoded by 
the DNA contained in ATCC Deposit No. 97379, wherein said 
enzyme is selected from the group consisting of M11TL, 
OC1/4V, F1-12G, 9N2-31B/G, MSB8-6G, AEDII12RA-18B/G, 
GC74-22G and VC1-7G1; 

(b) a polynucleotide complementary to the 
polynucleotide of (a); and 

(c) a polynucleotide comprising at least 15 
bases of the polynucleotide of (a) and (b). 

6. A vector comprising the DNA of Claim 2. 

7. A host cell comprising the vector of Claim 6. 

8. A process for producing a polypeptide comprising: 
expressing from the host cell of Claim 7 a polypeptide 
encoded by said DNA. 

9. A process for producing a cell comprising: 
transforming or transfecting the cell with the vector of 
Claim 6 such that the cell expresses the polypeptide 
encoded by the DNA contained in the vector. 

10. An enzyme comprising a member selected from the 
group consisting of: 

(a) an enzyme comprising an amino acid sequence 
which is at least 70% identical to the amino acid sequence 
set forth in SEQ ID NOS: 15-28; and 

(b) an enzyme which comprises at least 30 amino 
acid residues to the enzyme of (a). 

11. A method for generating glucose from soluble 
cellooligosaccharides comprising: 
administering an effective amount of an enzyme 
selected from the group consisting of an enzyme having the 
amino acid sequence set forth in SEQ ID NOS: 15-28. 

M11TL GLYCOSIDASE - 29G 
COMPLETE GENE SEQUENCE - 9/95 



AAA i'T. ..i* A/, A f.Ai m Alt 



i v ft,., r, . , i v ■ a*. i. rt». 



n a i-i-t. rrv i a a 

' i • I I'l.i Pli.- '.In 
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: I V ' ' <• »Tf- Civ •< -r C. I ii A:-n I'p . . A i 



.at ci i . a at a> r ,.A'i ret; rcc i;ta TCc c;\ . , A 'i cai 



Asp Ti p Tr p V,i I Tr p V* I it 



■I AAC ACA CCA CrT CCA fTA CTC ACC cc< CAT ..- rr , ;A , . AA ,. ^ ^ ( ;| ;T rAf . ( . f . ( ; ^ 

; Ar.r, Tin A AU C, y V.i i ■„ • , ,l V A.;p ,■„.. i-.n Glu ASH Gly ho Wly Tv , A . ;n 



J A! TPA AAC CAA AAT CAC CAC CAC CTC CCT GAG AAG CTC CCC CTT AAC ACT ATT ACA CTA CC 0 

(>1 !..-„ Asn r.ia ash Asp His Asp um. am ci u Lys Leu Cly Val Asn Tin lie Arg Val Gly 8u 

24 1 CTT GAG TCC ACT ACG ATT TTT CCA AAC CCA ACT TTC AAT CTT AAA CTC CCT CTA GAG AGA ■ J 00 

HI Val Clu Trp Ser Arg He Phe Pro tys ho Thr Phe Asn val Lys Val Pro Val Clu Arq l0 0 



360 
p 1 20 



J0 1 CAT GAG AAC CCC ACC ATT CTT CAC CTA CAT GTC CAT GAT AAA GCG CTT CAA AGA CTT CAT 

101 asp Glu Asn Cly Ser lie Val His Val asp Val Asp Asp Lys Al* Val Glu Arg Leu As 

36 1 CAA TTA CCC AAC AAC GAG CCC CTA AAC CAT TAC CTA GAA ATG TAT AAA G AC TCC CTT CAA 4 20 

12. Clu Leu Ala Asn Lys Glu Ala Val Asn Mis Tyr Val Clu Met Tyr Lys Asp Trp Val Giu 140 

421 ACA CCT AGA AAA CTT ATA CTC AAT TTA TAC CAT TCC CCC CTC CCT CTC TCC CTT CAC AAC 4 80 

14 1 Arg Cly Arg Lys Leu lie Leu Asn Leu Tyr His Trp Pro Leu Pro Leu Trp Leu His Asn 160 

48 i CCA ATC ATG CTC AGA ACA ATG CCC CCC CAC AGA CCC CCC TCA CCC TCG CTT AAC GAG C.AC >4Q 

161 Pro He «et Val Arg Arg Met Cly Pro Asp Arg Ala Pro Ser Cly Trp Leu Asn Glu Glu 180 

54 1 TCC CTC GTG GAG TTT CCC AAA TAC CCC CCA TAC ATT CCT TCG AAA ATG GGC GAG CTA CCT 600 

181 Ser Val Val Clu Ph« Ala Lys Tyr Ala Ala Tyr He Ala Trp Lys Met Gly Glu Leu Pro 200 

601 CTT ATC TCG ACC ACC ATG AAC GAA CCC AAC CTC CTT TAT GAG CAA GGA TAC ATG TTC CTT 660 

201 Val Het Trp Ser Thr Met Asn Glu Pro Asn Val Val Tyr Glu Gin Gly Tyr Met Phe Val 220 

661 AAA GGG GGT TTC CCA CCC CCC TAC TTC ACT TTC GAA CCT CCT GAT AAC CCC AGG AGA AAT 720 

221 Lys Gly Gly Phe Pro Pro Gly Tyr Leu Ser Leu Glu Ala Ala Asp Lys Ala Arg Arg Asn 240 

72 1 ATG ATC CAC CCT CAT CCA CCG CCC TAT CAC AAT ATT AAA CGC TTC ACT AAG AAA CCT CTT 7 80 

24 1 Met lie Gin Ala His Ala Arg Ala Tyr Asp Asn lie Lys Arg Phe Ser Lys Lys Pro Val 260 

791 GGA CTA ATA TAC CCT TTC CAA TGG TTC CAA CTA TTA GAG GGT CCA CCA GAA CTA TTT GAT 840 

261 Cly Leu He Tyr Ala Phe Gin Trp Phe Clu Leu Leu Clu Gly Pro Ala Glu Val Phe Asp 280 

84 1 AAC TTT AAG AGC TCT AAC TTA TAC TAT TTC ACA CAC ATA CTA TCC AAC GGT ACT TCA ATC 900 

231 Lys Phe Lys Ser Ser Lys Leu Tyr Tyr Phe Thr Asp He Val Ser Lys Cly Ser Ser lie 300 



901 ATC AAT CTT GAA TAC AGG ACA GAT CTT CCC AAT ACG CTA CAC TCG TTG GGC CTT AAC TAC 

301 He Asn Val Glu Tyr Arg Arg Asp Leu Ala Asn Arg Leu Asp Trp Leu Gly' Val Asn Tyr 

961 TAT ACC CCT TTA GTC TAC AAA ATC CTC CAT CAC AAA CCT ATA ATC CTC CAC GCG TAT GGA 

321 Tyr Ser Arg Leu Val Tyr Lys He Val Asp Asp Lys Pro lie He Leu His Cly Tyr Cly 340 

1021 TTC CTT TCT ACA CCT CGC GCG ATC ACC CCC CCT GAA AAT CCT TCT AGC GAT TTT GGG TCC 

34 1 Phe Leu Cys Thr Pro Gly Gly He Ser Pro Ala Glu Asn Pro Cys Ser Asp Phe Gly Trp 

108 1 CAC CTC TAT CCT CAA CCA CTC TAC CTA CTT CTA AAA GAA CTT TAC AAC CCA TAC CGC CTA 

3G1 Clu Val Tyr Pro Clu Cly Leu Tyr Leu Leu Leu Lys Glu Lou Tyr Asn Arg Tyr CI 



114 1 CAC TrC ATC CTC ACC C;A<; AAC GGT CTT TCA CAC AGC' ACJ; CAT CCG TTG AGA CCC CCA TAC 
381 Asp Leu IU: val Tin On Asn Cly Val Ser Asp Ser Arq Asp Ala Leu Arg Pro Ala Tyi 



960 
320 



1020 



1080 
360 



: 140 
Y Vol 380 



I ?00 
4 00 



120} CTG CTC TCG CAT CTT TAC AGC CTA TCG AAA CCC CCT AAC CAC CjCC ATT CCC CTC AAA CCC UbU 

40] l*m. Vdl -:.. r ms v.il T V i ser v.i t n u Lys am aU a.,, ci„ cj v llu Pro V.i 1 i.y* ci v i.-n 

l .M.I TAC CTC CAC TCC ACC TTC ACA CA< AAT TAC < ;AC Tc( . i ,i i .Ac CCC TTr At U : CAC AAA TT< I 

T >" T *" ■«'•" A-:., I'yr ..h, T f) . AC, cl„ Cly (•»,.. A rg C 1 ,, |. y .. .,], 
OC1/4V GLYCOSIDASE - 33G/B 
COMPLETE GENE SEQUENCE - 9/95 

l AT<; ATA AGA AGC TCC* CAT TTT f(A AAA CAT TTT ATC TTC GLiA ACC i :< *f At i, ,;,A CCA TAC <>0 

t mo» tie Arg Arg Ser Asp i'h~ - i. V s A.-:p i>he lie Phe Gly Thr M.i rhr M * am Tyr 20 

M i 'AG ATT CAA CCT CCA CCA AAf CAA ( iAT CCC ACA CCC CCA TCA ATT TCC CAT CTC TTT TCA 120 

"i f'.ln Tie Clu Cly Ala Ala Asn <;lu Asp Cly Arg Cly Pro Ser lie Tip Asp v^l Phe Ser 40 

!■>! <~AC ACC CCT CCC AAA ACC CTC AAC CfTF OAC ACA CCA CAC CTT CCC Tf *T f;AC CAT TAT CAC 180 

4 1 ii.s Thr Pro Cly Lys Thr Leu Asn Gly Asp Thr Cly Asp Val Ala Cys Asp Mis Tyr Mis 60 

I HI CCA TAC AAG CAA CAT ATC CAC CTC ATC AAA CAA ATA GGC TTA GAC CCT TAC ACC TTC TCT 24 0 

61 Arg Tyr Lys Clu Asp lie Gin Leu Met Lys Glu lie Gly Leu Asp Ala Tyr Arg Phe Ser 80 

24 1 ATC TCC TGG CCC AGA ATT ATC CCA CAT GGG AAG AAC ATC AAC CAA AAG CCT CTC CAT TTC 3 00 

81 He Ser Trp Pro Arg lie Met Pro Asp Gly Lys Asn lie Asn Gin Lys Gly Val Asp Phe 100 

301 TAC AAC AGA CTC CTT CAT GAG CTT TTC AAG AAT GAT ATC ATA CCA TTC CTA ACA CTC TAT 3 60 

101 Tyr Asn Axg Leu Vai Asp Glu Leu Leu Lys Asn Asp He He Pro Phe val Thr Leu Tyr 120 

361 CAC TGG GAC TTA CCC TAC CCA CTT TAT GAA AAA CCT GCA TGG CTT AAC CCA CAT ATA CCG 420 

121 His Trp Asp Leu Pro Tyr Ala Leu Tyr Glu Lys Cly Gly Trp Leu Asn Pro Asp lie Ala 140 

421 CTC TAT TTC AGA GCA TAC CCA ACC TTT ATC TTC AAC GAA CTC CCT GAT CCT CTC AAA CAT 480 

141 Leu Tyr Pfie Arg Ala Tyr Ala Thr Phe Met Phe Asn Clu Leu Gly Asp Arg val Lys Mxs 160 

48 1 TGG ATT ACA CTC AAC GAA CCA TGG TCT TCT TCT TTC TCC CCT TAT TAC ACG CCA GAG CAT 540 

161 Trp He Thr Leu Asn Glu Pro Trp Cys Ser Ser Phe Ser Gly Tyr Tyr Thr Cly Glu His 180 

541 CCC CCG CCT CAT CAA AAT TTA CAA GAA CCG ATA ATC CCG CCG CAC AAC CTC TTC AGG GAA 600 

181 Ala Pro Gly His Gin Asn Leu Gin Glu Ala He He Ala Ala His Asn Leu Leu Arg Glu 200 

601 CAT GCA CAT GCC CTC CAC CCG TCC AGA GAA GAA CTA AAA GAT GGG GAA CTT CCC TTA ACC 660 

201 His Cly His Ala Val Gin Ala Ser Arg Glu Clu Val Lye Asp Gly Glu Val Gly Leu Thr 220 

661 AAC CTT CTC ATG AAA ATA CAA CCG GGC CAT CCA AAA CCC GAA ACT TTC TTC CTC GCA ACT 720 

221 Asn Val Val Met Lys He Glu Pro Gly Asp Ala Lys Pro Clu Ser Phe Leu Vai Ala Ser 240 

721 CTT CTT GAT AAC TTC CTT AAT GCA TGG TCC CAT GAC CCT CTT CTT TTC GCA AAA TAT CCC 780 

241 Leu val Asp Lys Phe Val Asn Ala Trp Ser His Asp Pro Val Val Phe Gly Lys Tyr Pro 260 

781 CAA GAA GCA CTT CCA CTT TAT ACG GAA AAA GGG TTC CAA CTT CTC GAT AGC CAT ATG AAT 840 

261 Glu Glu Ala Val Ala Leu Tyr Thr Glu Lys Gly Leu Gin Val Leu Asp Ser Asp Met Asn 280 

841 ATT ATT TCG ACT CCT ATA CAC TTC TTT CCT CTC AAT TAT TAC ACA AGA ACA CTT CTT CTT 900 

281 He He Ser Thr Pro He Asp Phe Phe Gly Val Asn Tyr Tyr Thr Arg Thr Leu Val Val 300 

901 TTT GAT ATG AAC AAT CCT CTT GCA TTT TCC TAT CTT CAG GCA CAC CTT CCC AAA ACG GAG 960 

301 Phe Asp Met Asn Asn Pro Leu Gly Phe Ser Tyr Val Gin Gly Asp Leu Pro Lys Thr Glu 320 

961 ATC GCA TGG GAA ATC TAC CCC CAG GCA TTA TTT GAT ATG CTC CTC TAT CTC AAC GAA AGA 1020 

321 Met Gly Trp Glu He Tyr Pro Gin Gly Leu Phe Asp Met Leu Val Tyr Leu Lys Giu Arg 340 

1021 TAT AAA CTA CCA CTT TAT ATC ACA GAG AAC GGC ATC CCT GCA CCT CAT AAA TTC GAA AAC 1080 

34 1 Tyr Lys Leu Pro Leu Tyr lie Thr Glu Asn Gly Met Ala Gly Pro Asp Lys Leu Glu Asn 360 

1081 GCA ACA CTT CAT GAT AAT TAC CCA ATT CAA TAT TTC CAA AAG CAC TTT GAA AAA CCA CTT 1140 

361 Gly Arg Val His Asp Asn Tyr Arg He Glu Tyr Leu Glu Lys His Phe Glu Lys Ala Leu 380 

114 1 GAA CCA ATC AAT CCA GAT CTT GAT TTG AAA CCT TAC TTC ATT TCC TCT TTC ATC GAT AAC 1200 

)81 Clu Ala He Asn Ala Asp Val Asp Leu Lys Cly Tyr Phe lie Trp Ser Leu Met Asp Asn 400 

1201 TTC CAA TGG CCC TCC CCA TAC TCC AAA CCT TTC CCT ATA ATC TAC CTA CAT TAC AAT ACC 1260 

401 Phe Clu Trp aIa lys lily Tyr Ser Lys Arg Phe Cly lie He Tyr Val Asp Tyr Asn Thr 4/0 

12 6 1 CCA AAA ACC ATA TTC AAA CAT TCA CCC ATC TCC TTC AAG GAA TTT CTA AAA TCT TAA 13 17 

421 Pro Lys Arc/ ll.- l..-u Lys Asp Ser Ala Met Trp Leu Lys r; I vj Ph*. Leu Lys Ser End 4 I'l 
STAPHYLOTHERMUS MARINUS GLYCOSIDASE - 12G 
COMPLETE GENE SEQUENCE - 9/95 



1 Met Me Arq Pne Pro Asp Tyr Pne Leu Pht* c.\y THr Ala Thr Ser Ser His (.in t \ < 
6 1 GCT AAT AAC ATA TTT AAT GAT TCC TCC CAC TCC CAG ACT AAA GCC ACG ATT AA(i CTr 



4 1 



101 Gly He Glu Pro Val He Thr Lau Hi* His Phe Thr Asn Pro Gin Trp Phe Met Ly: 



221 



261 



301 



321 



34 1 



38 1 Glu Val Asp Tyr Lys Thr Phe Clu Arg Lys Pro Arg Lys Ser Ala Tyr Val Tyr Ser 
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Tn*r «oc o<<-^. t vn. 1 . r. ) v ; <;s \d*s c 3 18/0 



A TO CTA CCA GAA HGC TTT CTC TGC CuC HTC 
Mac Lrtu Pro C Lu Cly $>n* Leu Trp Cly vai 

61 CAC AAC CTC ACCi ACC AAC ATT GAT CCC AAC 

21 Asp Ly« L«u Ax? at a Aan lie asp rro Ain 

12 1 TTC AAC ATA AAC AGO GAA CTC CTC AGC CJC 

41 Ph* Aon lie lyi Axg Clu — cu val j«r Cly 

181 GAA CTT TAC GAG AAC CAT CAC CCC CTC L'CC 

61 CI j Leu Tyx Clu Lye Aap hu.e ax? Leu A.a 

241 CGA ATA GAG TOG AGC AGG ATC TTT CCC TGG 

81 Gly He Glu Trp sex Arg i.e Phe Pro Trp 

301 CGG GAC AGC TAC GGA CTC GTG AAC GAC GrC 

101 Arg Asp Ser Tyr Cly Uu Val Lys Asc Val 

361 CAC GAG ATA CCC AAT CAT CAC CAC ATA CCC 

12; Asp Clu ILe Al* Asn His Gin Clu I.e Ate 

421 GAG CTC CCC TTC AAC CTC ATC CTC AAC CZZ 

141 Glu Leu Gly Phe Ly» Val zle V*l Aan Leu 

461 CAT CCC ATA ATC CCC ACC CAC AAC CO? CTC 

161 Asp Pro He lie Ala Arg Clu Lys Ala _vu 

541 GAC ACC CTC CTC CAC TTC CCC AAO TAC COG 

181 Clu Ser Val Val Clu Ph*i Ala LyS Tyr Ala 

601 CTT CAT ATC TGC ACC ACC TTC AAC GAC CCC 

201 Val Asp Wee Trp Ser Thr Phe Aan siu Pro 

6 61 CCC TAC TCC GGC TTT CCC CCC GCC CTT ATG 

221 Pro Tyr Ser Gly Phe Pro Pro Gly val wet 

721 AAC ATG ATA AAC CCC CAC CCA CTC CCC TAC 

241 Aan Set He Asa Ala His Aid Leu AM Tyr 

781 CCC GAT AAC CAT TCC CGC TCC GAO GCC GAG 

2 61 Ala Asp -y« Asp Ser Arg Ser Glu Ala Glu 

• 4 1 GCC TAT CCA TAC GAC TCC AAC GAC CCA AAG 

2 81 Ala ryr Pro Tyr Asp Ser ah Aap rro Lys 

9 01- TTC CAC AGC GGC CTC TTC TTC GAC GCA ATC 

3 01 Phe Hi* Ser Gly Leu Phe pn* asp Ala He 

9 61 CCT CAC ACC TTC f*TC AAA CTT CCC CAT CTC 

321 Cly Clvi Thr Pha> Val Lye Val Arg hxm Leu 

021 TAC ACC ACA GAA CTC CTC ACC TAT TCC CAC 

3 41 Tyi Thr Arg Clu val val Arg Tyr Ser Clu 

OBI TTC CGG GGA CTT CAC AAC TAC GGC TAC GCC 

361 Phe Arg Cly val nxe An Tyr Gly Tyr A**« 

141 AGG CCC OTA AGC GAC ATC CCC TOO GAG ATC 

381 Arc Pro Val Scr Asp He Cly Trp tli lie 

2 01 GAG GCC AAC AAA TAC GGC CTC CCC CTT TAC 

401 Glu Ala Aan Lye Tyr Cly Val Pro via I Tyr 

2 61 CAC ACC CTG CCC CCG TAC TAC CTC GCO ACC 

421 Aap Thr Leu Arg Pro Tyr Tyr Leu Ale Scr 
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U^! CCG CCT TXC GkC 

441 Alj C^y Tyr Asp 

.J8i CTC CCT TTC AGO 

461 Lau Cly ?n» Ax? 

L 44 1 CCC COG CAC GAA 

461 Pro AX 9 Clu Clu 

150* GAA ATC CCC GAG 

*>01 Giu II* AX? Clu 



CTC ACC C'* -AC CT< taC 

v*l Arg G:y >r Leu Tyr 

ATG ACC TTC OGC CTC TAT 

««c Axg Pn« Gly L«>i Tyr 

ACC CTA AAG GTT TAT ACC 

Ser val Uf* Vai Tyr Ar9 

AAG TTC GGA CTT TC,A 

i.y« ?tw Gly -*u GLy End 



TGG CCG CTC xrC CAC AAC 
Trp Ala u«u The A*p A*n 

AAA GIL. GAT CTC ATA ACC 
L/i Val Asp- L«u I la Thr 

CCC ATC CTC CAC AAC AAC 
Cly I*u Val Clu Kan Aan 

V*0 



TAC GAG TGG CCC 

Tyr Clu Trp Ala 460 

AAC CAC AOA ACA 1440 

Uyu Olu Arg Thr 480 

GGA CTC AOC AAC 1500 

Cly val Ser Lys 500 
COMPLETE GENE SEQUENCE - 9/95 

1 ATC ATC CAC TCC CCG CTT AAA CCC ATT ATA TCT CAC CCT CCC CCC ATA ACT ATC ACA ATA 60

1 Met Ile Mu Cys Pro Val Lys Gly Ile Ile Ser Glu Ala Arg Gly Ile Thr Ile Thr Ile 20

61 CAT TTA ACT TTT CAA CCC CAA ATA AAT AAT TTG CTC AAT CCT ATC ATT GTC TTT CCC CAC 120

21 Asp Leu Ser Phe Gln Gly Gln Ile Asn Asn Leu Val Asn Ala Met Ile Val Phe Pro Glu 40

121 TTC TTC CTC TTT CCA ACC CCC ACA TCT TCT CAT CAC ATC CAC CCA CAT AAT AAA TCC AAC 180

41 Phe Phe Leu Phe Gly Thr Ala Thr Ser Ser His Gln Ile Glu Gly Asp Asn Lys Trp Asn 60

181 CAC TCC TCC TAT TAT GAC CAC ATA CCT AAC CTC CCC TAC AAA TCC CCT AAA CCC TCC AAT 240

61 Asp Trp Trp Tyr Tyr Glu Glu Ile Gly Lys Leu Pro Tyr Lys Ser Gly Lys Ala Cys Asn 80

241 CAC TCC CAC CTT TAC ACC CAA GAT ATA GAC CTA ATC CCA CAC CTC CCC TAC AAT CCC TAC 300

81 His Trp Glu Leu Tyr Arg Glu Asp Ile Glu Leu Met Ala Gln Leu Gly Tyr Asn Ala Tyr 100

301 CCC TTT TCC ATA GAC TCC ACC CCT CTC TTC CCG CAA GAG CCC AAA TTC AAT CAA CAA CCC 360

101 Arg Phe Ser Ile Glu Trp Ser Arg Leu Phe Pro Glu Glu Gly Lys Phe Asn Glu Glu Ala 120

361 TTC AAC CCC TAC CCT CAA ATA ATT CAA ATC CTC CTT GAG AAC CCC ATT ACT CCA AAC CTT 420

121 Phe Asn Arg Tyr Arg Glu Ile Ile Glu Ile Leu Leu Glu Lys Gly Ile Thr Pro Asn Val 140

421 ACA CTC CAC CAC TTC ACA TCA CCG CTC TCC TTC ATC CCG AAC CGA CCC TTT TTC AAC CAA 480

141 Thr Leu His His Phe Thr Ser Pro Leu Trp Phe Met Arg Lys Gly Gly Phe Leu Lys Glu 160

481 CAA AAC CTC AAC TAC TCG GAG CAC TAC CTT CAT AAA CCC CCC GAG CTC CTC AAG CCA CTC 540

161 Glu Asn Leu Lys Tyr Trp Glu Gln Tyr Val Asp Lys Ala Ala Glu Leu Leu Lys Gly Val 180

541 AAG CTT CTA CCT ACA TTC AAC CAC CCC ATC CTC TAT CTT ATC ATC CCC TAC CTC ACA CCC 600

181 Lys Leu Val Ala Thr Phe Asn Glu Pro Met Val Tyr Val Met Met Gly Tyr Leu Thr Ala 200

601 TAC TCG CCC CCC TTC ATC AAC ACT CCC TTT AAA CCC TTT AAA CTT CCC CCA AAC CTC CTT 660

201 Tyr Trp Pro Pro Phe Ile Lys Ser Pro Phe Lys Ala Phe Lys Val Ala Ala Asn Leu Leu 220

661 AAG CCC CAT CCA ATC CCA TAT CAT ATC CTC CAT CCT AAC TTT CAT CTC CCC ATA CTT AAA 720

221 Lys Ala His Ala Met Ala Tyr Asp Ile Leu His Gly Asn Phe Asp Val Gly Ile Val Lys 240

721 AAC ATC CCC ATA ATC CTC CCT CCA ACC AAC ACA CAC AAA CAC CTA CAA CCT CCC CAA AAC 780

241 Asn Ile Pro Ile Met Leu Pro Ala Ser Asn Arg Glu Lys Asp Val Glu Ala Ala Gln Lys 260

781 GCG CAT AAC CTC TTT AAC TCC AAC TTC CTT GAT CCA ATA TCC ACC CCA AAA TAT AAA CCA 840

261 Ala Asp Asn Leu Phe Asn Trp Asn Phe Leu Asp Ala Ile Trp Ser Gly Lys Tyr Lys Gly 280

841 CCT TTT CGA ACT TAC AAA ACT CCA CAA ACC GAT CCA CAC TTC ATA CCG ATA AAC TAC TAC 900

281 Ala Phe Gly Thr Tyr Lys Thr Pro Glu Ser Asp Ala Asp Phe Ile Gly Ile Asn Tyr Tyr 300

901 ACA CCC ACC GAC CTA ACC CAT ACC TCC AAT CCC CTA AAG TTT TTC TTC GAT CCC AAC CTT 960

301 Thr Ala Ser Glu Val Arg His Ser Trp Asn Pro Leu Lys Phe Phe Phe Asp Ala Lys Leu 320

961 CCA GAC TTA ACC GAG ACA AAA ACA CAT ATC CCT TCC ACT CTC TAT CCA AAG CCC ATA TAC 1020

321 Ala Asp Leu Ser Glu Arg Lys Thr Asp Met Gly Trp Ser Val Tyr Pro Lys Gly Ile Tyr 340

1021 CAA CCT ATA CCA AAG CTT TCA CAC TAC CCA AAG CCA ATC TAC ATC ACG CAA AAC CCG ATA 1080

341 Glu Ala Ile Ala Lys Val Ser His Tyr Gly Lys Pro Met Tyr Ile Thr Glu Asn Gly Ile 360

1081 CCT ACC TTA CAC GAT GAG TCC ACC ATA CAC TTT ATC ATC CAC CAC CTC CAC TAC CTT CAC 1140

361 Ala Thr Leu Asp Asp Glu Trp Arg Ile Glu Phe Ile Ile Gln His Leu Gln Tyr Val His 380

1141 AAA CCC TTA AAC GAT CCC TTT CAC TTC ACA CCC TAC TTC TAT TCC TCT TTT ATC GAT AAC 1200

381 Lys Ala Leu Asn Asp Gly Phe Asp Leu Arg Gly Tyr Phe Tyr Trp Ser Phe Met Asp Asn 400

1201 TTC GAC TCG CCT GAC CCT TTT ACA CCA CCC TTT CCC CTC CTC CAC CTC CAC TAC ACC ACC 1260

401 Phe Glu Trp Ala Glu Gly Phe Arg Pro Arg Phe Gly Leu Val Glu Val Asp Tyr Thr Thr 420

1261 TTC AAG ACC ACA CCC ACA AAC ACT CCT TAC ATA TAT CGA CAA ATT CCA ACC CAA AAG AAA 1320

421 Phe Lys Arg Arg Pro Arg Lys Ser Ala Tyr Ile Tyr Gly Glu Ile Ala Arg Glu Lys Lys 440

1321 ATA AAA GAC CAA CTC CTC CCA AAC TAT CCC CTT CCC CAC CTA TCA 1365

441 Ile Lys Asp Glu Leu Leu Ala Lys Tyr Gly Leu Pro Glu Leu End 455
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GAA ATG TGC GTG CGC AAT GTG AAT CCO ATG ACT ACC GCC ATC TGG TAT GCC TCC 
Glu Met Cys Val Arg Asn Val Asn Pro Met Thr Thr Ala Ile Trp Tyr Ala Ser

1197 1206 1215 1224 1233 1242 

ATG CTC GCC ACC TTC GCC GAT AAC GGC CTC GAA ATA TTC ACC CCA TGG TCC TGG 
Met Leu Gly Thr Phe Ala Asp Asn Gly Val Glu Ile Phe Thr Pro Trp Cys Trp

1251 1260 1269 1278 1287 1296 

AAC ACC GGA ATG TGG GAA ACA CTC CAC CTC TTC AGC CCC TAC AAC AAA CCT TAT 
Asn Thr Gly Met Trp Glu Thr Leu His Leu Phe Ser Arg Tyr Asn Lys Pro Tyr

1305 1314 1323 1332 1341 1350 

CCO CTC GCC TCC AGC TCC ACT CTT GAA GAG TTT GTC ACC CCC TAC ACC TCC ATT 
Arg Val Ala Ser Ser Ser Ser Leu Glu Glu Phe Val Ser Ala Tyr Ser Ser Ile

1359 1368 1377 1386 1395 1404

AAC GAA CCA GAA GAC GCC ATG ACG OTA CTT CTG GTG AAT CCT TCC ACT AGC GAC 
Asn Glu Ala Glu Asp Ala Met Thr Val Leu Leu Val Asn Arg Ser Thr Ser Glu

Bankia gouldi endoglucanase (37GP1) (continued)

1413 1422 1431 1440 1449 1458

ACC CAC ACC GCC ACT GTC GCT ATC GAC GAT TTC CCA CTG GAT GGC CCC TAC CGC
Thr His Thr Ala Thr Val Ala Ile Asp Asp Phe Pro Leu Asp Gly Pro Tyr Arg

1467 1476 1485 1494 1503 1512 

ACC CTG CGC TTA CAC AAC CTG CCG GGG GAG GAA ACC TTC GTA TCT CAC CGA GAC 
Thr Leu Arg Leu His Asn Leu Pro Gly Glu Glu Thr Phe Val Ser His Arg Asp

1521 1530 1539 1548 1557 1566 

AAC GCC CTG GAA AAA GCT ACA GTG CGC GCC AGC GAC AAT ACG GTA ACA CTG GAG 
Asn Ala Leu Glu Lys Gly Thr Val Arg Ala Ser Asp Asn Thr Val Thr Leu Glu 

1575 1584 1593 1602 1611

TTG CCC CCT CTG TCC GTT ACT GCA ATA TTG CTC AAG GCC CGO CCC TAA 3'
Leu Pro Pro Leu Ser Val Thr Ala Ile Leu Leu Lys Ala Arg Pro ***
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5 ' CTC ATC TCT CTC GAA ATA TPC GOA AAC ACC TTC AHA CAT; OCA ACA TTC GTT CTC 

Val Ile Cys Val Glu Ile Phe Gly Lys Thr Phe Arg Glu Gly Arg Phe Val Leu

Lys Glu Lys Asn Phe Thr Val Glu Phe Ala Val Glu Lys Ile His Leu Gly Trp

117 126 135 144 153 162 

AAO ATC TCC GGC AGG GTC AAG CCA ACT OCG QGA AOG CTT GAG (7TT CTT CGA ACG 

Lys Ile Ser Gly Arg Val Lys Gly Ser Pro Gly Arg Leu Glu Val Leu Arg Thr

171 180 189 198 207 216 

AAA GCA CCG GAA AAG GTA CTT GTC AAC AAC TOG CAC TCC TOG QGA CCG TGC AGG 

Lys Ala Pro Glu Lys Val Leu Val Asn Asn Trp Gln Ser Trp Gly Pro Cys Arg

Thermotoga maritima alpha-galactosidase

603 612 621 630 639 648 

GAT CTC ATC TOG GAA GAG ACX" CPC AAG AAC CTC AAG CTC GCC AAC AAT TTC CCC 



Asp Leu Ttu* Trp Glu Glu Thr Leu Lys Asn Leu Lys Leu Ala Lys Asn Phe Pro

657 666 675 684 693 702 

TTC GAG CTC TTC CAG ATA GAC GAC GCC TAG GAA AAG GAC ATA GGT GAC TOC CTC 

Phe Glu Val Phe Gln Ile Asp Asp Ala Tyr Glu Lys Asp Ile Gly Asp Trp Leu

711 720 729 738 747 756 

OTG ACA AGA GGA GAC TTT CCA TOG GTG GAA GAG ATG OCA AAA OTT ATA OCG GAA 

Val Thr Arg Gly Asp Phe Pro Ser Val Glu Glu Met Ala Lys Val Ile Ala Glu

765 774 783 792 801 810 

AAC GOT TTC ATC COG GQC ATA TCG ACC CCC COG TIC ACT GTT TCT GAA AOC TCC 

Asn Gly Phe Ile Pro Gly Ile Trp Thr Ala Pro Phe Ser Val Ser Glu Thr Ser

819 828 837 846 855 864 

GAT GTA TIC AAC GAA CAT CCG GAC TGG GTA CTC AAG GAA AAC GGA GAG CCG AAG 

Asp Val Phe Asn Glu His Pro Asp Trp Val Val Lys Glu Asn Gly Glu Pro Lys 

873 882 891 900 909 918

ATC OCT TAC AGA AAC TOG AAC AAA AAG ATA TOC GCC CTC GAT CTT TCG AAA GAT

Met Ala Tyr Arg Asn Trp Asn Lys Lys Ile Tyr Ala Leu Asp Leu Ser Lys Asp

927 936 945 954 963 972 

GAG GTT CTC AAC TCG CTT TTC GAT CTC TTC TCA TCT CTC AGA AAG ATG OGC ISC 

Glu Val Leu Asn Trp Leu Phe Asp Leu Phe Ser Ser Leu Arg Lys Met Gly Tyr

981 990 999 1008 1017 1026

AGG TAC TTC AAG ATC GAC TTT CTC TTC GOG GGT GCC CTT OCA GGA GAA AGA AAA

Arg Tyr Phe Lys Ile Asp Phe Leu Phe Ala Gly Ala Val Pro Gly Glu Arg Lys

1035 1044 1053 1062 1071 1080 

AAG AAC ATA ACA CCA ATT CAG GOG TTC AGA AAA GGG ATT GAG ACG ATC AGA AAA

1197 1206 1215 1224 1233 1242

GAA CAT ATA C^A GAC AAC OGA OCT CCC GOT GCA ACA TOG GCG CTC AGA AAC OCC 



Glu His Ile Glu Asp Asn Gly Ala Pro Ala Ala Arg Trp Ala Leu Arg Asn Ala

1251 1260 1269 1278 1287 1296

ATA ACG AGO TAC TTC ATC CAC GAC AOC TTC TOG CTC AAC GAC CCC OUT TOT CTC

Ile Thr Arg Tyr Phe Met His Asp Arg Phe Trp Leu Asn Asp Pro Asp Cys Leu

1305 1314 1323 1332 1341 1350

ATA CTC AGA GAG GAG AAA ACG GAT CTC ACA CAG AAG GAA AAG GAG CTC TAC TOG

Ile Leu Arg Glu Glu Lys Thr Asp Leu Thr Gln Lys Glu Lys Glu Leu Tyr Ser

1359 1368 1377 1386 1395 1404

TAC ACG TCT GGA CTC CTC GAC AAC ATC ATC ATA GAA AGO GAT GAT CTC TCG CTC

Tyr Thr Cys Gly Val Leu Asp Asn Met Ile Ile Glu Ser Asp Asp Leu Ser Leu

1413 1422 1431 1440 1449 1458 

GTC AGA GAT CAT GGA AAA AAG GTT CTC AAA GAA ACQ CTC QtA CTC CTC OCT GGA 

Val Arg Asp His Gly Lys Lys Val Leu Lys Glu Thr Leu Glu Leu Leu Gly Gly

1467 1476 1485 1494 1503 1512 

AGA CCA COG CTT CAA AAC ATC ATC TCG GAC GAT CTG AGA TAC GAG ATC CTC TOS 

Arg Pro Arg Val Gln Asn Ile Met Ser Glu Asp Leu Arg Tyr Glu Ile Val Ser

1521 1530 1539 1548 1557 1566 

TCT GGC ACT CTC TCA OCA AAC GTC AAG ATC GTC CTC GAT CTC AAC AGC AGA GAG 

Ser Gly Thr Leu Ser Gly Asn Val Lys Ile Val Val Asp Leu Asn Ser Arg Glu

1575 1584 1593 1602 1611 1620

TAC CAC CTC GAA AAA GAA GGA AAG TCC TCC CTC AAA AAA AGA C7IC CTC AAA AGA

Tyr His Leu Glu Lys Glu Gly Lys Ser Ser Leu Lys Lys Arg Val Val Lys Arg

1629 1638 1647 1656 1665 

GAA GAC GGA AGA AAC TTC TAC TTC TAC GAA GAG OCT GAG AGA GAA TCA 3'

Glu Asp Gly Arg Asn Phe Tyr Phe Tyr Glu Glu Gly Glu Arg Glu ***

1629 1638 1647 1656 1665 1674

ATT GAA TCG AAC GGT GAG GTG GGA AAT GCA GCA CTG CAG CTG AAC GTG AAA CTG 

Ile Glu Trp Asn Gly Glu Val Gly Asn Gly Ala Leu Gln Leu Asn Val Lys Leu

1683 1692 1701 1710 1719 1728 

CCC GGA AAG AGC GAC TGG GAA GAA GTO AGA GTA GCA AGG AAG TTC GAA AGA CTC 

Pro Gly Lys Ser Asp Trp Glu Glu Val Arg Val Ala Arg Lys Phe Glu Arg Leu

1737 1746 1755 1764 1773 1782 

TCA GAA TCT GAG ATC CTC GAG TAC GAC ATC TAC ATT CCA AAC GTC GAG GGA CTC

Phe Ile Asp Asn Val Arg Leu Tyr Lys Arg Thr Gly Gly Met ***

AEPII 1a β-Mannosidase (63GB1)

AEPII 1a β-Mannosidase (63GB1) (continued)

549 558 567 576 585 594

AGG ACA CTT CTT GAG TTT GCC AAQ TAT OCT GCT TAC ATC GCC CAT GCG CTC GGA

Arg Thr Val Val Glu Phe Ala Lys Tyr Ala Ala Tyr Ile Ala His Ala Leu Gly

603 612 621 630 639 648 

GAC CTC GTG GAC ACA TCG AGC ACC TTC AAC GAA CCT ATG GTA GTT GTO GAG CTC 

Asp Leu Val Asp Thr Trp Ser Thr Phe Asn Glu Pro Met Val Val Val Glu Leu 

657 666 675 684 693 702 

GGC TAC CTC GCC CCC TAC TCA GGA TTT CCC CCO GGA GTC ATG AAC CCC GAG GCC 

Gly Tyr Leu Ala Pro Tyr Ser Gly Phe Pro Pro Gly Val Met Asn Pro Glu Ala

711 720 729 738 747 756 

GCG AAG CTG GCO ATC CTC AAC ATG ATA AAC GCC CAC GCC TTG GCA TAT AAG ATG 

Ala Lys Leu Ala Ile Leu Asn Met Ile Asn Ala His Ala Leu Ala Tyr Lys Met

765 774 783 792 801 810 

ATA AAG AGG TTC GAC ACC AAG AAG GCC GAT GAG GAT AGC AAG TCC CCT GCG GAC 

Ile Lys Arg Phe Asp Thr Lys Lys Ala Asp Glu Asp Ser Lys Ser Pro Ala Asp

819 828 837 846 855 864

GTT GGC ATA ATT TAC AAC AAC ATC GCT GTT GCC TAC CCT AAA GAC CCT AAC GAT

Val Gly Ile Ile Tyr Asn Asn Ile Gly Val Ala Tyr Pro Lys Asp Pro Asn Asp

873 882 891 900 909 918 

CCC AAG GAC GTT AAA GCA GCC GAA AAC GAC AAC TAC TTC CAC AOC GGA CTG TTC 

Pro Lys Asp Val Lys Ala Ala Glu Asn Asp Asn Tyr Phe His Ser Gly Leu Phe

927 936 945 954 963 972 

TTT GAT GCC ATC CAC AAG GGT AAG CTC AAC ATA GAG TTC GAC GCC GAA AAC TTT 

Phe Asp Ala Ile His Lys Gly Lys Leu Asn Ile Glu Phe Asp Gly Glu Asn Phe

981 990 999 1008 1017 1026 

GTA AAA GTT AGA CAC CTA AAA GCC AAT GAC TGG ATA GGC CTC AAC TAC TAC ACC 

Val Lys Val Arg His Leu Lys Gly Asn Asp Trp Ile Gly Leu Asn Tyr Tyr Thr

1035 1044 1053 1062 1071 1080

CGC GAG GTT GTT AGA TAT TCG GAG CCC AAG TTC CCA ACT ATA CCC CTC ATA TCC

Arg Glu Val Val Arg Tyr Ser Glu Pro Lys Phe Pro Ser Ile Pro Leu Ile Ser

Figure 12 (Continued)

AEPII 1a β-Mannosidase (63GB1) (continued)

1089 1098 1107 1116 1125 1134 

TTC AAO GGC GTT CCC AAC TAC GGC TAC TCC TGC AGG CCC GGC ACQ ACC TCC GCC 

Phe Lys Gly Val Pro Asn Tyr Gly Tyr Ser Cys Arg Pro Gly Thr Thr Ser Ala

1143 1152 1161 1170 1179 1188

GAT GGC ATG CCC GTC AGC GAT ATC GGC TGG GAA GTC TAT CCC GAG GGA ATC TAC

Asp Gly Met Pro Val Ser Asp Ile Gly Trp Glu Val Tyr Pro Gln Gly Ile Tyr

1197 1206 1215 1224 1233 1242 

GAC TCG ATA GTC GAG GCC ACC AAG TAC ACT GTT CCT GTT TAC GTC ACC GAG AAC 

Asp Ser Ile Val Glu Ala Thr Lys Tyr Ser Val Pro Val Tyr Val Thr Glu Asn

1251 1260 1269 1278 1287 1296 

GOT GTT GCG GAT TCC GCG GAC ACG CTG AGG CCA TAC TAC ATA GTC AOC CAC GTC 

Gly Val Ala Asp Ser Ala Asp Thr Leu Arg Pro Tyr Tyr Ile Val Ser His Val

1305 1314 1323 1332 1341 1350 

TCA AAG ATA GAG GAA GCC ATT GAG AAT GGA TAC CCC GTA AAA GGC TAC ATG TAC 

Ser Lys Ile Glu Glu Ala Ile Glu Asn Gly Tyr Pro Val Lys Gly Tyr Met Tyr

1359 1368 1377 1386 1395 1404 

TGG GCG CTT ACG GAT AAC TAC GAG TGG GCC CTC GGC TTC AOC ATG AGG TTT GGT 

Trp Ala Leu Thr Asp Asn Tyr Glu Trp Ala Leu Gly Phe Ser Met Arg Phe Gly 

1413 1422 1431 1440 1449 1458 

CTC TAC AAG GTC GAC CTC ATC TCC AAG GAG AGG ATC CCG AGG GAG AGA AGC GTT 

Leu Tyr Lys Val Asp Leu Ile Ser Lys Glu Arg Ile Pro Arg Glu Arg Ser Val

1467 1476 1485 1494 1503 1512 

GAG ATA TAT CGC AGG ATA CTG CAG TCC AAC GGT GTT CCT AAG GAT ATC AAA GAG 

Glu Ile Tyr Arg Arg Ile Val Gln Ser Asn Gly Val Pro Lys Asp Ile Lys Glu

1521 1530 1539 

GAG TTC CTG AAG GGT GAG GAG AAA TGA 3' 

Glu Phe Leu Lys Gly Glu Glu Lys ***

OC1/4V Endoglucanase (33GP1)

OC1/4V Endoglucanase (33GP1) (continued)

549 558 567 576 585 594 

GAG CCT GCT CAG AAC TTG ACA GCT GAA AAA TGG AAC GCA CTT TAT CCA AAA GTC 

Glu Pro Ala Gln Asn Leu Thr Ala Glu Lys Trp Asn Ala Leu Tyr Pro Lys Val

603 612 621 630 639 648 

CTC AAA GTT ATC AGG GAG AGC AAT CCA ACC CGG ATT GTC ATT ATC GAT GCT CCA 

Leu Lys Val Ile Arg Glu Ser Asn Pro Thr Arg Ile Val Ile Ile Asp Ala Pro

657 666 675 684 693 702 

AAC TOG GCA CAC TAT AGC GCA GTG AGA AGT CTA AAA TTA GTC AAC GAC AAA CGC 

Asn Trp Ala His Tyr Ser Ala Val Arg Ser Leu Lys Leu Val Asn Asp Lys Arg

711 720 729 738 747 756 

ATC ATT GTT TCC TTC CAT TAC TAC GAA CCT TTC AAA TTC ACA CAT CAG OCT GCC 

Ile Ile Val Ser Phe His Tyr Tyr Glu Pro Phe Lys Phe Thr His Gln Gly Ala

765 774 783 792 801 810

GAA TOG GTT AAT CCC ATC CCA CCT GTT AGG GTT AAG TGG AAT GGC GAC GAA TOG

Glu Trp Val Asn Pro Ile Pro Pro Val Arg Val Lys Trp Asn Gly Glu Glu Trp

819 828 837 846 855 864 

GAA ATT AAC CAA ATC AGA AGT CAT TTC AAA TAC GTG AGT GAC TGG GCA AAG CAA 

Glu Ile Asn Gln Ile Arg Ser His Phe Lys Tyr Val Ser Asp Trp Ala Lys Gln

873 882 891 900 909 918 

AAT AAC CTA CCA ATC TTT CTT GGT GAA TTC GGT GCT TAT TCA AAA GCA GAC ATC 

Asn Asn Val Pro Ile Phe Leu Gly Glu Phe Gly Ala Tyr Ser Lys Ala Asp Met

927 936 945 954 963 972 

GAC TCA AGG GTT AAG TGG ACC GAA AGT GTG AGA AAA ATG GCG GAA GAA TTT GGA 

Asp Ser Arg Val Lys Trp Thr Glu Ser Val Arg Lys Met Ala Glu Glu Phe Gly

981 990 999 1008 1017 1026 

TTT TCA TAC GCC TAT TGG GAA TTT TGT GCA GGA TTT GGC ATA TAC GAT AGA TGG 

Phe Ser Tyr Ala Tyr Trp Glu Phe Cys Ala Gly Phe Gly Ile Tyr Asp Arg Trp

1035 1044 1053 1062 1071 1080 

TCT CAA AAC TOG ATC GAA CCA TTG GCA ACA GCT GTG GTT GGC ACA GGC AAA GAG 

Ser Gln Asn Trp Ile Glu Pro Leu Ala Thr Ala Val Val Gly Thr Gly Lys Glu

TAA 3'

***

Figure 13 (Continued)

Thermotoga maritima Pullulanase (6GP3)

Thermotoga maritima Pullulanase (6GP3) (continued)

549 558 567 576 585 594 

AAA AAC GGA GAA GAC ACA GAA CCG TAC CAG GTT GTG AAC ATG GAA TAC AAG GGA 

Lys Asn Gly Glu Asp Thr Glu Pro Tyr Gln Val Val Asn Met Glu Tyr Lys Gly

603 612 621 630 639 648 

AAC GGG GTC TOG GAA GCG GTT GTT GAA GGC GAT CTC GAC GGA GTG TTC TAC CTC 

Asn Gly Val Trp Glu Ala Val Val Glu Gly Asp Leu Asp Gly Val Phe Tyr Leu 

657 666 675 684 693 702 

TAT CAG CTG GAA AAC TAC GGA AAG ATC AGA ACA ACC GTC GAT CCT TAT TCG AAA 

Tyr Gln Leu Glu Asn Tyr Gly Lys Ile Arg Thr Thr Val Asp Pro Tyr Ser Lys

711 720 729 738 747 756

GCG GTT TAC GCA AAC AAC CAA GAG AGC GCC GTT GTG AAT CTT GCC AGO ACA AAC

Ala Val Tyr Ala Asn Asn Gln Glu Ser Ala Val Val Asn Leu Ala Arg Thr Asn

765 774 783 792 801 810 

CCA GAA GGA TOG GAA AAC GAC AGG GGA CCG AAA ATC GAA GGA TAC GAA GAC GCG 

Pro Glu Gly Trp Glu Asn Asp Arg Gly Pro Lys Ile Glu Gly Tyr Glu Asp Ala

819 828 837 846 855 864

ATA ATC TAT GAA ATA CAC ATA GCG GAC ATC ACA GGA CTC GAA AAC TCC GGG GTA

Ile Ile Tyr Glu Ile His Ile Ala Asp Ile Thr Gly Leu Glu Asn Ser Gly Val

873 882 891 900 909 918

AAA AAC AAA GGC CTC TAT CTC GGG CTC ACC GAA GAA AAC ACG AAA GGA CCG GGC 

Lys Asn Lys Gly Leu Tyr Leu Gly Leu Thr Glu Glu Asn Thr Lys Gly Pro Gly 

927 936 945 954 963 972 

GGT GTG ACA ACA GGC CTT TCG CAC CTT GTG GAA CTC GGT CTT ACA CAC GTT CAT 

Gly Val Thr Thr Gly Leu Ser His Leu Val Glu Leu Gly Val Thr His Val His 

981 990 999 1008 1017 1026 

ATA CTT CCT TTC TTT GAT TTC TAC ACA GGC GAC GAA CTC GAT AAA GAT TTC GAG 

Ile Leu Pro Phe Phe Asp Phe Tyr Thr Gly Asp Glu Leu Asp Lys Asp Phe Glu

1035 1044 1053 1062 1071 1080 

AAG TAC TAC AAC TGG GOT TAC GAT CCT TAC CTG TTC ATG GTT CCG GAG GGC AGA 

Lys Tyr Tyr Asn Trp Gly Tyr Asp Pro Tyr Leu Phe Met Val Pro Glu Gly Arg 

Figure 14 (Continued)

WO 97/25417

PCT/US97/0009

Thermotoga maritima Pullulanase (6GP3) (continued)
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